Econ 265: Introduction to Econometrics

Topic 4: Multiple Linear Regression

Moshi Alam

Introduction

  • The idea here is to extend the concepts discussed in simple linear regression
  • Again, it is crucial that you make yourself comfortable with the concepts of simple linear regression before moving on here
  • Consequently, we will move much faster than in the previous topic
  • Once again the big picture will be to extend the assumptions revolving around \(u_i\) and \(x_i\) to multiple variables
    • to understand how the OLS estimator plays out

Multiple Linear Regression

  • The multiple linear regression model is given by: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_k x_{ki} + u_i\]

    • \(i = 1, 2, \ldots, n\) indexes the observations
    • \(y_i\) is the dependent variable,
    • \(x_{1i}, x_{2i}, \ldots, x_{ki}\) are the independent variables,
    • \(\beta_0, \beta_1, \ldots, \beta_k\) are the parameters
    • \(u_i\) is the error term
  • The key assumptions will be generalized versions of those in SLR \[E(u_i|x_{1i}, x_{2i}, \ldots, x_{ki}) = 0\] \[E(u_i) = 0\]

  • These imply: \(E(x_{1i}u_i) = E(x_{2i}u_i) = \ldots = E(x_{ki}u_i) = 0\)

Obtaining the OLS estimates

  • Once again we will minimize the sum of squared residuals to estimate the parameters as \(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k\)
  • Specifically, we will choose \(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k\) to minimize the following: \[\sum_{i=1}^n (y_i - \hat{y}_i)^2 = \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i} - \ldots - \hat{\beta}_k x_{ki})^2\]

This minimization problem leads to \(k+1\) linear equations (the first-order conditions, or FOCs) in the \(k+1\) unknowns \(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k\):

\[ \begin{aligned} & \sum_{i=1}^n\left(y_i-\hat{\beta}_0-\hat{\beta}_1 x_{1i}-\cdots-\hat{\beta}_k x_{ki}\right)=0 \\ & \sum_{i=1}^n x_{1i}\left(y_i-\hat{\beta}_0-\hat{\beta}_1 x_{1i}-\cdots-\hat{\beta}_k x_{ki}\right)=0 \\ & \sum_{i=1}^n x_{2i}\left(y_i-\hat{\beta}_0-\hat{\beta}_1 x_{1i}-\cdots-\hat{\beta}_k x_{ki}\right)=0 \\ & \vdots \\ & \sum_{i=1}^n x_{ki}\left(y_i-\hat{\beta}_0-\hat{\beta}_1 x_{1i}-\cdots-\hat{\beta}_k x_{ki}\right)=0 \end{aligned} \]

The FOCs

  • The FOCs give us \(k+1\) equations in \(k+1\) unknowns
  • Note that we arrive at this because we assume that any change in \(x_{1i}\), \(x_{2i}\), \(\ldots\), \(x_{ki}\) will not affect the error term \(u_i\)
  • They are the sample counterparts of \(k+1\) population moment conditions
    • Recall we did something similar in SLR
  • The moment conditions are given by: \[\begin{aligned} E(u_i) &= 0 \\ E(x_{1i}u_i) &= 0 \\ E(x_{2i}u_i) &= 0 \\ \vdots \\ E(x_{ki}u_i) &= 0 \end{aligned}\]
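  • Before turning to R's lm(), here is a minimal sketch (on made-up simulated data; all object names are mine) of what these conditions amount to: stacking the regressors, plus a column of ones, into a matrix \(X\), the \(k+1\) FOCs say \(X'(y - Xb) = 0\), i.e. the normal equations \(X'Xb = X'y\)

set.seed(265)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)            # simulated model with known parameters

X <- cbind(1, x1, x2)                             # design matrix with intercept column
b_manual <- solve(crossprod(X), crossprod(X, y))  # solves the normal equations X'X b = X'y
b_lm     <- coef(lm(y ~ x1 + x2))

cbind(b_manual, b_lm)                             # the two sets of estimates coincide

With these mechanics in hand, the examples below use the wage1 data from the wooldridge package: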

library(wooldridge)
wage_data <- wage1
# run a regression of log(wage) on educ, exper, and tenure
reg1 <- lm(log(wage) ~ educ + exper + tenure, data = wage_data)
summary(reg1)

Call:
lm(formula = log(wage) ~ educ + exper + tenure, data = wage_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.05802 -0.29645 -0.03265  0.28788  1.42809 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.284360   0.104190   2.729  0.00656 ** 
educ        0.092029   0.007330  12.555  < 2e-16 ***
exper       0.004121   0.001723   2.391  0.01714 *  
tenure      0.022067   0.003094   7.133 3.29e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4409 on 522 degrees of freedom
Multiple R-squared:  0.316, Adjusted R-squared:  0.3121 
F-statistic: 80.39 on 3 and 522 DF,  p-value: < 2.2e-16
  • keeping educ fixed, what is the effect of changing exper and tenure by one year simultaneously?

# keeping educ fixed, what is the effect of changing exper and tenure by one year simultaneously?

reg1$coefficients[3] + reg1$coefficients[4]
     exper 
0.02618833 
  • It returns an object of type Named num (note the carried-over name exper in the output above)
  • The code is also not easy to read
  • Let us be explicit and verbose instead
beta_exper <- reg1$coefficients[3] 
beta_tenure <-  reg1$coefficients[4] 
total_effect <- as.numeric(beta_exper + beta_tenure)
total_effect
[1] 0.02618833

Residual and fitted values

  • The fitted/predicted values are given by: \[\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \ldots + \hat{\beta}_k x_{ki}\]
  • The residuals are given by: \[\hat{u}_i = y_i - \hat{y}_i\] \[= y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i} - \ldots - \hat{\beta}_k x_{ki}\]
# Fitted values
fitted_values <- reg1$fitted.values
head(fitted_values)
       1        2        3        4        5        6 
1.304921 1.523506 1.304921 1.819802 1.461690 1.970451 
# Residuals
residuals <- reg1$residuals
head(residuals)
          1           2           3           4           5           6 
-0.17351855 -0.34793290 -0.20630834 -0.02804287  0.20601726  0.19860264 
# manual residuals
observed_wage <- log(wage_data$wage)
manual_residuals <- observed_wage - fitted_values
# test if manual residuals are approximately the same as residuals
all.equal(residuals, manual_residuals)
[1] TRUE

Properties of fitted values and residuals

Immediate extension of SLR and implications from the moment conditions:

  • Sample average of residuals is zero: \(\sum_{i=1}^n \hat{u}_i = 0\)
  • Sample covariance between residuals and each of the independent variables is zero: \(\sum_{i=1}^n x_{1i} \hat{u}_i = 0\) \(\sum_{i=1}^n x_{2i} \hat{u}_i = 0\) \(\vdots\) \(\sum_{i=1}^n x_{ki} \hat{u}_i = 0\)
  • The point \((\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k, \bar{y})\) lies on the regression plane \[\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}_1 + \hat{\beta}_2 \bar{x}_2 + \ldots + \hat{\beta}_k \bar{x}_k\]
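  • A quick numerical check of these three properties on the reg1 fit from above (tiny nonzero values are just floating-point noise):

# 1. the residuals sum to (numerically) zero
sum(reg1$residuals)

# 2. zero sample covariance between the residuals and each regressor
with(wage_data, c(cov(educ,   reg1$residuals),
                  cov(exper,  reg1$residuals),
                  cov(tenure, reg1$residuals)))

# 3. the point of sample means lies on the regression plane:
#    mean of log(wage) equals the plane evaluated at the regressor means
c(mean(log(wage_data$wage)),
  sum(coef(reg1) * c(1, mean(wage_data$educ),
                        mean(wage_data$exper),
                        mean(wage_data$tenure))))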

Partialing out interpretation

Consider a case with k=2, i.e., two independent variables.

The population model is given by: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i\]

Let us focus on \(\beta_1\)

# run a regression of log(wage) on educ, exper
reg_wage_on_educ_exp <- lm(log(wage) ~ educ + exper, data = wage_data)
summary(reg_wage_on_educ_exp)

Call:
lm(formula = log(wage) ~ educ + exper, data = wage_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.05800 -0.30136 -0.04539  0.30601  1.44425 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.216854   0.108595   1.997   0.0464 *  
educ        0.097936   0.007622  12.848  < 2e-16 ***
exper       0.010347   0.001555   6.653 7.24e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4614 on 523 degrees of freedom
Multiple R-squared:  0.2493,    Adjusted R-squared:  0.2465 
F-statistic: 86.86 on 2 and 523 DF,  p-value: < 2.2e-16

# step 1: run a regression of log(wage) on exper
reg_logwage_on_exp <- lm(log(wage) ~ exper, data = wage_data)
residuals_logwage_on_exp <- reg_logwage_on_exp$residuals
# step 2: run a regression of educ on exper
reg_educ_on_exp <- lm(educ ~ exper, data = wage_data)
residuals_educ_on_exp <- reg_educ_on_exp$residuals
# step 3: run a regression of residuals from step 1 on residuals from step 2
partial_out_reg <- lm(residuals_logwage_on_exp ~ residuals_educ_on_exp)
summary(partial_out_reg)

Call:
lm(formula = residuals_logwage_on_exp ~ residuals_educ_on_exp)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.05800 -0.30136 -0.04539  0.30601  1.44425 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)           -2.192e-18  2.010e-02    0.00        1    
residuals_educ_on_exp  9.794e-02  7.615e-03   12.86   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.461 on 524 degrees of freedom
Multiple R-squared:  0.2399,    Adjusted R-squared:  0.2385 
F-statistic: 165.4 on 1 and 524 DF,  p-value: < 2.2e-16

What’s going on?

  • The population model is given by: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i\]
  • To get the effect of \(x_{1i}\) on \(y_i\), the MLR estimate \(\hat{\beta}_1\) can be obtained by the following three-step procedure:
    • Step 1: Regress \(y_i\) on \(x_{2i}\) and obtain the residuals \(\hat{e}_i\)
      • \(\hat{e}_i\) has the variation left in \(y_i\) after partialling out variation from \(x_{2i}\)
    • Step 2: Regress \(x_{1i}\) on \(x_{2i}\) and obtain the residuals \(\hat{v}_i\)
      • \(\hat{v}_i\) has the variation left in \(x_{1i}\) after partialling out variation from \(x_{2i}\)
    • Step 3: Regress \(\hat{e}_i\) on \(\hat{v}_i\)
  • The idea is to partial out the effect of \(x_{2i}\) from \(x_{1i}\) and \(y_i\)
    • The slope from step 3 is identical to \(\hat{\beta}_1\) from regressing \(y_i\) on \(x_{1i}\) and \(x_{2i}\)
  • Hence the interpretation of \(\beta_1\) as the effect of \(x_{1i}\) on \(y_i\), holding \(x_{2i}\) fixed
  • There is another way to approach this. See textbook 3-2f and implement it in R
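  • As a quick check (using the objects created above), the slope from step 3 equals the educ coefficient from the regression on both variables:

# the partialled-out slope equals the multiple-regression coefficient on educ
c(partial_out = unname(coef(partial_out_reg)[2]),
  multiple    = unname(coef(reg_wage_on_educ_exp)["educ"]))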

Comparison of Simple and Multiple Regression Estimates

  • The estimated multiple regression can be written as: \[\hat{y_i} = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i}\]

  • Suppose we only ran the simple regression of \(y\) on \(x_{1i}\): \[\tilde{y_i} = \tilde{\beta}_0 + \tilde{\beta}_1 x_{1i}\]

  • Simple regression coefficient \(\tilde{\beta}_1\) relates to the multiple regression coefficient \(\hat{\beta}_1\) via: \[\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \delta_1\] where \(\delta_1\) is the slope from the regression of \(x_{2i}\) on \(x_{1i}\) from estimating: \[x_{2i} = \delta_0 + \delta_1 x_{1i} + \epsilon\]
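  • A minimal sketch verifying this identity with the wage data, taking educ as \(x_{1i}\) and exper as \(x_{2i}\) (the object names are mine):

# tilde_beta1: simple regression of log(wage) on educ only
tilde_beta1 <- coef(lm(log(wage) ~ educ, data = wage_data))["educ"]
# hat_beta1 and hat_beta2: multiple regression on educ and exper
hat_betas <- coef(lm(log(wage) ~ educ + exper, data = wage_data))
# delta1: slope from regressing exper (x2) on educ (x1)
delta1 <- coef(lm(exper ~ educ, data = wage_data))["educ"]

# tilde_beta1 equals hat_beta1 + hat_beta2 * delta1
c(tilde_beta1, hat_betas["educ"] + hat_betas["exper"] * delta1)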

Key Insight

  1. Confounding term:
    • \(\hat{\beta}_2 \delta_1\): The partial effect of \(x_{2i}\) on \(\hat{y}\) scaled by the relationship between \(x_{2i}\) and \(x_{1i}\).
  2. Two distinct cases where \(\tilde{\beta}_1 = \hat{\beta}_1\):
    • The partial effect of \(x_{2i}\) on \(\hat{y}\) is zero: \(\hat{\beta}_2 = 0\).
    • \(x_{1i}\) and \(x_{2i}\) are uncorrelated: \(\delta_1 = 0\).
  3. Thus, in an MLR of \(y_i\) on \(x_{1i}, \cdots, x_{ki}\),
    • the OLS coefficient on \(x_{1i}\)
    • will be identical to the one obtained from a simple regression of \(y_i\) on \(x_{1i}\)
    • only if \(x_{1i}\) is uncorrelated (in the sample) with \(x_{2i}, \cdots, x_{ki}\), or the OLS coefficients on \(x_{2i}, \cdots, x_{ki}\) are all zero
  4. Example of ability bias in wage regressions

Goodness of Fit

  • The idea of the \(R^2\) statistic in MLR is the same as in SLR
    • The proportion of the total variation in \(y_i\) that is explained by the independent variables in the model \[R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}\], where:
      • \(SSE = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2\) is the explained sum of squares
      • \(SSR = \sum_{i=1}^n \hat{u}_i^2\) is the sum of squared residuals
      • \(SST = \sum_{i=1}^n (y_i - \bar{y})^2\) is the total sum of squares
  • Always between 0 and 1
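  • As a sanity check, the sketch below recomputes \(R^2\) for reg1 from its fitted values and residuals and compares it to the value reported by summary() (0.316):

logwage <- log(wage_data$wage)
SST <- sum((logwage - mean(logwage))^2)              # total sum of squares
SSE <- sum((reg1$fitted.values - mean(logwage))^2)   # explained sum of squares
SSR <- sum(reg1$residuals^2)                         # sum of squared residuals

c(SSE / SST, 1 - SSR / SST, summary(reg1)$r.squared) # all three agree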

Assumptions of MLR

  1. MLR.1: Linearity: The population model \(y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_k x_{ki} + u_i\) is linear in the parameters \(\beta_0, \beta_1, \ldots, \beta_k\)
  2. MLR.2: Random Sampling: The data \((y_i, x_{1i}, x_{2i}, \ldots, x_{ki})\) for \(i = 1, 2, \ldots, n\) are a random sample from the population
  3. MLR.3: No Perfect Collinearity: The X's are not perfectly collinear
    • No independent variable is an exact linear combination of the others
  4. MLR.4: Zero Conditional Mean: \(E(u_i|x_{1i}, x_{2i}, \ldots, x_{ki}) = 0\)
    • This implies that \(E(u_i) = 0\) and \(E(x_{ji}u_i) = 0\) for all \(j = 1, 2, \ldots, k\)
    • If this holds, then the \(x_{ji}\) are called exogenous
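  • To see MLR.3 in action, here is a small illustration: if we add a regressor that is an exact linear combination of the others, lm() cannot separately identify its coefficient and reports NA for it

# I(exper + tenure) is an exact linear combination of exper and tenure,
# so its coefficient is not identified and lm() returns NA for it
coef(lm(log(wage) ~ educ + exper + tenure + I(exper + tenure), data = wage_data))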

Theorem: Unbiasedness of the OLS estimator in MLR

Under Assumptions MLR.1-MLR.4, the OLS estimator is unbiased. That is, \(E(\hat{\beta}_j) = \beta_j\) for \(j = 0, 1, \ldots, k\).
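A small simulation sketch of what this means (the data-generating process and all names here are made up): across repeated samples drawn from a model satisfying MLR.1-MLR.4, the OLS estimates average out to the true parameters.

set.seed(123)
beta <- c(1, 2, -0.5)                          # true (beta0, beta1, beta2)
one_draw <- function(n = 100) {
  x1 <- rnorm(n)
  x2 <- 0.5 * x1 + rnorm(n)                    # regressors may be correlated
  y  <- beta[1] + beta[2] * x1 + beta[3] * x2 + rnorm(n)
  coef(lm(y ~ x1 + x2))                        # OLS estimates from one sample
}
estimates <- replicate(2000, one_draw())       # 3 x 2000 matrix of estimates
rowMeans(estimates)                            # close to the true (1, 2, -0.5)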

Variance of the OLS estimator

Homoskedasticity assumption

  • Having an estimate is of no use without a measure of its precision.
  • Similar to SLR, we will make an assumption on the variance of the error term \(u_i\) conditional on the independent variables \(x_{1i}, x_{2i}, \ldots, x_{ki}\):
  • MLR.5: Homoskedasticity: \(Var(u_i|x_{1i}, x_{2i}, \ldots, x_{ki}) = \sigma^2\)
    • This implies that \(Var(u_i) = \sigma^2\)
  • Using the population model \(y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \ldots + \beta_k x_{ki} + u_i\), we can take the conditional variance of both sides to get: \[Var(y_i|x_{1i}, x_{2i}, \ldots, x_{ki}) = Var(u_i|x_{1i}, x_{2i}, \ldots, x_{ki}) = \sigma^2\]
  • Why do we get this? Conditional on the \(x\)'s, the term \(\beta_0 + \beta_1 x_{1i} + \ldots + \beta_k x_{ki}\) is a constant, and adding a constant does not change a variance (recall the formula for the variance of a sum).
  • Again as before we do not know \(\sigma^2\) but we can estimate it using the residuals similar to SLR.
  • But first, let's look at the sampling variance of the OLS estimates

Sampling variance of OLS estimator

Theorem: Sampling variance of OLS estimator in MLR

Under assumptions MLR.1-MLR.5, the sampling variance of the OLS estimator of the \(j^{th}\) slope coefficient, \(\hat{\beta}_j\), is given by: \[Var(\hat{\beta}_j) = \frac{\sigma^2}{(1 - R_j^2)\sum_{i=1}^n (x_{ji} - \bar{x}_j)^2}\] for \(j = 1, 2, \ldots, k\), where:

  • \(\sigma^2\) is the variance of the error term \(u_i\) conditional on the independent variables
  • \(\bar{x}_j\) is the sample mean of \(x_{ji}\)
  • \(\sum_{i=1}^n (x_{ji} - \bar{x}_j)^2\) is the total sample variation in \(x_{ji}\) (often written \(SST_j\))
  • \(R_j^2\) is the \(R^2\) from regressing \(x_{ji}\) on all other independent variables and an intercept.
  • Observe how each component enters the formula and why each of them is important.
  • Look for connections with MLR.3
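  • A sketch evaluating this formula for the educ coefficient in reg1 (with \(\sigma^2\) replaced by its estimate, so the result matches the variance reported by vcov() exactly; the object names are mine):

sigma2_hat <- summary(reg1)$sigma^2                   # estimated error variance
R2_educ    <- summary(lm(educ ~ exper + tenure, data = wage_data))$r.squared
SST_educ   <- sum((wage_data$educ - mean(wage_data$educ))^2)

var_manual <- sigma2_hat / ((1 - R2_educ) * SST_educ) # formula from the theorem
c(var_manual, vcov(reg1)["educ", "educ"])             # identical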

The role of \(R_j^2\)

  • \(R_j^2\) is the \(R^2\) from regressing \(x_{ji}\) on all other independent variables and an intercept.
  • What happens to \(Var(\hat{\beta}_j)\) as \(R_j^2\) approaches 1?
  • What does it mean when \(R_j^2\) is very close to 1 versus equal to 1?
    • Multicollinearity VS Perfect Collinearity
      • Definition: Multicollinearity arises when independent variables in a regression model are highly correlated, making it difficult to estimate the effect of each variable accurately.
  • Example: Estimating the effect of different school expenditure categories (e.g., teacher salaries, materials, athletics) on student performance.
  • Challenge:
    • Wealthier schools spend more across all categories, leading to high correlations among variables.
    • Difficult to estimate the partial effect of one category due to lack of variation.
  • Solution: Consider combining correlated variables into a single index.

More on multicollinearity

  • Consider a 3 variable population model: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + u_i\]
    • \(x_{2i}\) and \(x_{3i}\) are highly correlated
    • \(x_{1i}\) is uncorrelated with \(x_{2i}\) and \(x_{3i}\)
  • The OLS estimator will still be unbiased because MLR1-MLR4 are satisfied
  • However, the variance of the OLS estimates of \(\beta_2\) and \(\beta_3\) will be very high
  • This is because the \(R^2\) from
    • regressing \(x_{2i}\) on \(x_{3i}\) will be very high
    • regressing \(x_{3i}\) on \(x_{2i}\) will be very high
  • But the variance of \(\hat{\beta}_1\) will be unaffected. Why?
  • Observe that \(Var(\hat{\beta}_1) = \frac{\sigma^2}{(1 - R_1^2)\sum_{i=1}^n (x_{1i} - \bar{x}_1)^2}\)
    • what is \(R_1^2\) based on the information you have?
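  • A simulation sketch of this point (the data-generating process is made up): \(x_{2i}\) and \(x_{3i}\) are almost identical, \(x_{1i}\) is independent of both, and the standard errors on \(\hat{\beta}_2\) and \(\hat{\beta}_3\) blow up while the one on \(\hat{\beta}_1\) does not

set.seed(42)
n  <- 500
x1 <- rnorm(n)                     # uncorrelated with the other regressors
x2 <- rnorm(n)
x3 <- x2 + rnorm(n, sd = 0.05)     # x3 is almost a copy of x2
y  <- 1 + x1 + x2 + x3 + rnorm(n)

# standard errors: very large for x2 and x3, unaffected for x1
summary(lm(y ~ x1 + x2 + x3))$coefficients[, "Std. Error"]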

Example: Final Exam Scores

  • Scenario: Predicting final exam scores using:
    • Key Variable: Number of classes attended.
    • Other control Variables: Cumulative GPA, SAT score, and high school performance.
  • Concern: Other control variables are highly correlated (multicollinear).
  • A useful exercise is to look at the standard errors of the OLS estimates.

Completing the model

  • Now that we have laid out the assumptions for unbiasedness, and added homoskedasticity for precision
    • We are left to estimate the variance of the error term \(\sigma^2\) to complete the model
  • Similar to SLR, we can estimate \(\sigma^2\) using the residuals: \[\hat{\sigma}^2 = \frac{1}{n - k - 1} \sum_{i=1}^n \hat{u}_i^2\]
  • The degrees of freedom is \(n - k - 1\) because we have \(n\) observations and \(k+1\) parameters to estimate
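  • For reg1, the sketch below computes \(\hat{\sigma}^2\) from the residuals and checks that the implied \(\hat{\sigma}\) matches the "Residual standard error" (0.4409) printed by summary():

n <- nobs(reg1)                              # number of observations (526)
k <- length(coef(reg1)) - 1                  # number of slope coefficients (3)
sigma2_hat <- sum(reg1$residuals^2) / (n - k - 1)

c(sqrt(sigma2_hat), summary(reg1)$sigma)     # both equal the reported SER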

Theorem: Unbiased estimation of \(\sigma^2\)

Under assumptions MLR.1-MLR.5, \(E(\hat{\sigma}^2) = \sigma^2\)

  • \(\hat{\sigma}\) is the standard error of the regression (SER)

Gauss-Markov Theorem

  • The Gauss-Markov Theorem is a key result in econometrics that states that under the assumptions MLR.1-MLR.5, the OLS estimator is the best linear unbiased estimator (BLUE) of the population parameters.
  • Best: The OLS estimator has the smallest variance among all linear unbiased estimators.
  • Linear: The OLS estimator is a linear function of the dependent variable.
  • Unbiased: The OLS estimator is unbiased.
  • That is, the OLS estimator is the most efficient among all linear unbiased estimators.

Standard errors of \(\hat{\beta}_j\)

  • Under homoskedasticity, the standard error of the OLS estimate \(\hat{\beta}_j\) is given by: \[SE(\hat{\beta}_j) = \sqrt{Var(\hat{\beta}_j)}\]
  • an estimate of the standard deviation of the sampling distribution of \(\hat{\beta}_j\).
  • a measure of the precision of the OLS estimate \(\hat{\beta}_j\).
  • used to construct confidence intervals and conduct hypothesis tests.
  • estimated using the formula: \[SE(\hat{\beta}_j) = \sqrt{\frac{\hat{\sigma}^2}{(1 - R_j^2)\sum_{i=1}^n (x_{ji} - \bar{x}_j)^2}} = \frac{\hat{\sigma}}{\sqrt{1 - R_j^2}\,\sqrt{n}\,\mathrm{sd}(x_j)}\] where \(\mathrm{sd}(x_j) = \sqrt{n^{-1}\sum_{i=1}^n (x_{ji} - \bar{x}_j)^2}\)
  • Notice the role of \(R_j^2\) and \(n\) in the formula.
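  • Continuing the sampling-variance sketch from above (same objects), the standard error on educ is just the square root of that estimated variance and matches the Std. Error column of summary(reg1):

# square root of the manually computed variance of the educ coefficient
se_manual <- sqrt(sigma2_hat / ((1 - R2_educ) * SST_educ))
c(se_manual, summary(reg1)$coefficients["educ", "Std. Error"])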

Omitted variable bias

Recall

  • Recall the example we did in class with the wage data on educ and exper
  • For the true population model satisfying MLR1-MLR4: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i\]
  • Suppose we omitted \(x_{2i}\) and only ran the simple regression of \(y_i\) on \(x_{1i}\): \[\tilde{y_i} = \tilde{\beta}_0 + \tilde{\beta}_1 x_{1i}\]
  • \(\tilde{\beta}_1\) relates to \(\hat{\beta}_1\) via: \[\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \delta_1\] where \(\delta_1\) is the slope from the regression of \(x_{2i}\) on \(x_{1i}\) from estimating: \[x_{2i} = \delta_0 + \delta_1 x_{1i} + \epsilon\]
  • So \(E(\tilde{\beta}_1) = \beta_1 + \beta_2 \delta_1\). Bias = \(E(\tilde{\beta}_1) - \beta_1 = \beta_2 \delta_1\)
  • Let us generalize this to a three variable case

Omitted variable bias

For the true population model satisfying MLR.1-MLR.4: \[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + u_i\]

  • Suppose \(x_{3i}\) and \(x_{2i}\) are uncorrelated but \(x_{3i}\) and \(x_{1i}\) are correlated
  • Suppose we omitted \(x_{3i}\) and only ran the regression of \(y_i\) on \(x_{1i}\) and \(x_{2i}\): \[\tilde{y_i} = \tilde{\beta}_0 + \tilde{\beta}_1 x_{1i} + \tilde{\beta}_2 x_{2i}\]

  • Might be tempting to think that \(\tilde{\beta}_2\) is unbiased and only \(\tilde{\beta}_1\) is biased
  • But both will be biased because of the induced correlation between \(x_{1i}\) and \(x_{2i}\) when \(x_{3i}\) is omitted
  • \(\tilde{\beta}_2\) is unbiased only when \(x_{1i}\) and \(x_{2i}\) are uncorrelated. In that case we can show that: \[E\left(\tilde{\beta}_1\right)=\beta_1+\beta_3 \frac{\sum_{i=1}^n\left(x_{1i}-\bar{x}_1\right)\left(x_{3i}-\bar{x}_3\right)}{\sum_{i=1}^n\left(x_{1i}-\bar{x}_1\right)^2}\]

Ability bias example

  • The true population model is given by: \[log(wage_i) = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 ability_i + u_i\]
  • Suppose \(exper_i\) and \(ability_i\) are uncorrelated (probably not true)
  • Suppose we omitted the variable \(ability_i\) and only ran the regression of \(log(wage_i)\) on \(educ_i\) and \(exper_i\): \[\widetilde{log(wage_i)} = \tilde{\beta}_0 + \tilde{\beta}_1 educ_i + \tilde{\beta}_2 exper_i\]
  • Both \(\tilde{\beta}_1\) and \(\tilde{\beta}_2\) will be biased
    • Even if we assume \(exper_i\) and \(ability_i\) are uncorrelated
  • The bias in \(\tilde{\beta}_2\) is due to the induced correlation between \(educ_i\) and \(exper_i\) when \(ability_i\) is omitted because \(educ_i\) and \(ability_i\) are correlated
  • Confounding the effect of ability. Direction of the bias?
  • So imagine the further complexity if we assumed all variables were pairwise correlated.

Variances in misspecified models

  • We saw what happens to bias. What happens to precision?
  • True model: \(y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i\) and \(x_{1i}\) and \(x_{2i}\) are correlated
  • Estimate two models:
    • Model 1: \(\hat{y_i} = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i}\)
    • Model 2: \(\tilde{y_i} = \tilde{\beta}_0 + \tilde{\beta}_1 x_{1i}\)
  1. When \(\beta_2 \neq 0, \widetilde{\beta}_1\) is biased, \(\hat{\beta}_1\) is unbiased, and \(\operatorname{Var}\left(\widetilde{\beta}_1\right)<\operatorname{Var}\left(\hat{\beta}_1\right)\)
  2. When \(\beta_2=0, \widetilde{\beta}_1\) and \(\hat{\beta}_1\) are both unbiased, and \(\operatorname{Var}\left(\widetilde{\beta}_1\right)<\operatorname{Var}\left(\hat{\beta}_1\right)\)
  • This is because (recall the formula for the variance of the OLS estimator):
    • \(\operatorname{Var}\left(\hat{\beta}_1\right)=\sigma^2 /\left[\operatorname{SST}_1\left(1-R_1^2\right)\right]\)
    • \(\operatorname{Var}\left(\widetilde{\beta}_1\right)=\sigma^2 /\left[\operatorname{SST}_1\right]\)
  • Takeaway: including an irrelevant variable (\(\beta_2 = 0\)) does not affect the unbiasedness of the OLS estimator, but it does reduce the precision of \(\hat{\beta}_1\) whenever \(x_{1i}\) and \(x_{2i}\) are correlated; omitting a relevant variable lowers the variance at the cost of bias.
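  • A sketch of the two variance formulas with the wage data, taking educ as \(x_{1i}\) and exper as \(x_{2i}\), and plugging the same \(\hat{\sigma}^2\) into both so that only the \((1 - R_1^2)\) term differs:

# same sigma^2 estimate in both formulas isolates the role of (1 - R_1^2)
sigma2_hat <- summary(reg_wage_on_educ_exp)$sigma^2
SST_1 <- sum((wage_data$educ - mean(wage_data$educ))^2)
R2_1  <- summary(lm(educ ~ exper, data = wage_data))$r.squared

c(var_tilde = sigma2_hat / SST_1,                   # simple regression formula
  var_hat   = sigma2_hat / (SST_1 * (1 - R2_1)))    # multiple regression formula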