The omitted variable bias formula (2 of 2)
• If an omitted variable Z is both:
1. a determinant of Y (that is, it is contained in u); and
2. correlated with X,
then ρ_Xu ≠ 0 and the OLS estimator β̂_1 is biased and is not consistent.
• For example, districts with few ESL students (1) do better on standardized tests and (2) have smaller classes (bigger budgets), so ignoring the ESL-student factor would result in overstating the class size effect. Is this actually going on in the CA data?
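To see the bias mechanically, here is a minimal simulation sketch (Python with numpy and statsmodels; all variable names and parameter values are illustrative, not from the text). Z determines Y and is correlated with X, so the short regression of Y on X alone misses the true coefficient:

import numpy as np
import statsmodels.api as sm

# Illustrative data-generating process: Z satisfies both OVB conditions.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)                   # Z is correlated with X
y = 1.0 + 2.0 * x - 1.5 * z + rng.normal(size=n)   # Z determines Y; true beta_1 = 2.0

short = sm.OLS(y, sm.add_constant(x)).fit()                       # omits Z
both = sm.OLS(y, sm.add_constant(np.column_stack([x, z]))).fit()  # includes Z

print(short.params[1])  # about 1.4: biased, and the bias does not vanish as n grows
print(both.params[1])   # about 2.0: including Z removes the bias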
TABLE 6.1 Differences in Test Scores for California School Districts with Low and High Student–Teacher Ratios, by the Percentage of English Learners in the District

                                  STR < 20                 STR ≥ 20                 Difference, Low vs. High STR
                                  Average Test Score   n   Average Test Score   n   Difference   t-statistic
All districts                     657.4              238   650.0              182    7.4          4.04
Percentage of English learners:
  < 1.9%                          664.5               76   665.4               27   −0.9         −0.30
  1.9–8.8%                        665.2               64   661.8               44    3.3          1.13
  8.8–23.0%                       654.9               54   649.7               50    5.2          1.72
  > 23.0%                         636.7               44   634.8               61    1.9          0.68
• Districts with fewer English Learners have higher test scores
• Districts with lower percent EL (PctEL) have smaller classes
• Among districts with comparable PctEL, the effect of class size is small
(recall overall “test score gap” = 7.4)
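The cross tabulation above can be reproduced in a few lines. A sketch, assuming the California data are loaded in a pandas DataFrame df with columns testscr, str, and el_pct (hypothetical column names):

import pandas as pd

# Stratify districts by PctEL, then compare mean test scores for
# small (STR < 20) vs. large (STR >= 20) classes within each stratum.
df["small_class"] = df["str"] < 20
df["el_group"] = pd.cut(df["el_pct"], bins=[0, 1.9, 8.8, 23.0, 100],
                        labels=["< 1.9%", "1.9-8.8%", "8.8-23.0%", "> 23.0%"])

table = df.groupby(["el_group", "small_class"])["testscr"].mean().unstack()
table["difference"] = table[True] - table[False]  # small-minus-large, as in Table 6.1
print(table)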
Using regression to estimate causal effects
• The test score/STR/fraction English Learners example shows that, if an omitted variable satisfies the two conditions for omitted variable bias, then the OLS estimator in the regression omitting that variable is biased and inconsistent. So, even if n is large, β̂_1 will not be close to β_1.
• We have distinguished between two uses of regression: for prediction,
and to estimate causal effects.
– Regression also can be used simply to summarize the data without attaching any
meaning to the coefficients or for any other purpose, but we won’t focus on this
use.
• In the class size application, we clearly are interested in a causal effect:
what do we expect to happen to test scores if the superintendent reduces
the class size?
What, precisely, is a causal effect?
• “Causality” is a complex concept!
• In this course, we take a practical approach to defining causality:
A causal effect is defined to be the effect measured in an ideal
randomized controlled experiment.
Ideal Randomized Controlled Experiment
• Ideal: subjects all follow the treatment protocol – perfect compliance, no errors in reporting, etc.!
• Randomized: subjects from the population of interest are
randomly assigned to a treatment or control group (so there are
no confounding factors)
• Controlled: having a control group permits measuring the
differential effect of the treatment
• Experiment: the treatment is assigned as part of the experiment:
the subjects have no choice, so there is no “reverse causality” in
which subjects choose the treatment they think will work best.
Back to class size:
Imagine an ideal randomized controlled experiment for measuring
the effect on Test Score of reducing STR…
• In that experiment, students would be randomly assigned to
classes, which would have different sizes.
• Because they are randomly assigned, all student characteristics (and thus u_i) would be distributed independently of STR_i.
• Thus, E(u_i|STR_i) = 0 – that is, LSA #1 holds in a randomized controlled experiment.
How does our observational data differ from
this ideal? (1 of 2)
• The treatment is not randomly assigned
• Consider PctEL – the percent of English learners in the district. It plausibly satisfies the two criteria for omitted variable bias: Z = PctEL is:
1. a determinant of Y; and
2. correlated with the regressor X.
• Thus, the “control” and “treatment” groups differ in a systematic way, so corr(STR, PctEL) ≠ 0.
How does our observational data differ from
this ideal? (2 of 2)
• Randomization implies that any differences between the treatment and control groups are random – not systematically related to the treatment.
• We can eliminate the difference in PctEL between the large class
(control) and small class (treatment) groups by examining the
effect of class size among districts with the same PctEL.
– If the only systematic difference between the large and small class size groups is in PctEL, then we are back to the randomized controlled experiment – within each PctEL group.
– This is one way to “control” for the effect of PctEL when estimating the
effect of STR.
Return to omitted variable bias
Three ways to overcome omitted variable bias
1. Run a randomized controlled experiment in which treatment (STR) is
randomly assigned: then PctEL is still a determinant of TestScore, but
PctEL is uncorrelated with STR. (This solution to OV bias is rarely
feasible.)
2. Adopt the “cross tabulation” approach, with finer gradations of STR and PctEL – within each group, all classes have the same PctEL, so we control for PctEL. (But soon you will run out of data, and what about other determinants like family income and parental education?)
3. Use a regression in which the omitted variable (PctEL) is no longer
omitted: include PctEL as an additional regressor in a multiple
regression.
The Population Multiple Regression Model
(SW Section 6.2)
• Consider the case of two regressors:
Y_i = β_0 + β_1 X_1i + β_2 X_2i + u_i,  i = 1,…,n
• Y is the dependent variable
• X_1, X_2 are the two independent variables (regressors)
• (Y_i, X_1i, X_2i) denote the i-th observation on Y, X_1, and X_2.
• β_0 = unknown population intercept
• β_1 = effect on Y of a change in X_1, holding X_2 constant
• β_2 = effect on Y of a change in X_2, holding X_1 constant
• u_i = the regression error (omitted factors)
Interpretation of coefficients in multiple
regression (1 of 2)
Y_i = β_0 + β_1 X_1i + β_2 X_2i + u_i,  i = 1,…,n
Consider the difference in the expected value of Y for two values of X_1, holding X_2 constant:
Population regression line when X_1 = X_1,0:
Y = β_0 + β_1 X_1,0 + β_2 X_2
Population regression line when X_1 = X_1,0 + ΔX_1:
Y + ΔY = β_0 + β_1(X_1,0 + ΔX_1) + β_2 X_2
Interpretation of coefficients in multiple
regression (2 of 2)
Before:     Y = β_0 + β_1 X_1,0 + β_2 X_2
After:      Y + ΔY = β_0 + β_1(X_1,0 + ΔX_1) + β_2 X_2
Difference: ΔY = β_1 ΔX_1
So:
β_1 = ΔY/ΔX_1, holding X_2 constant
β_2 = ΔY/ΔX_2, holding X_1 constant
β_0 = predicted value of Y when X_1 = X_2 = 0.
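For example, using the California estimates reported below (β_1 ≈ −1.10 on STR, holding PctEL constant), cutting the student–teacher ratio by 2 changes the predicted test score by ΔY = β_1·ΔX_1 = (−1.10)(−2) = +2.2 points.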
The OLS Estimator in Multiple Regression
(SW Section 6.3)
• With two regressors, the OLS estimator solves:
min over (b_0, b_1, b_2) of Σ_{i=1}^n [Y_i − (b_0 + b_1 X_1i + b_2 X_2i)]²
• The OLS estimator minimizes the average squared difference between the actual values of Y_i and the prediction (predicted value) based on the estimated line.
• This minimization problem is solved using calculus.
• This yields the OLS estimators of β_0, β_1, and β_2.
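In matrix form the solution to this minimization is β̂ = (X′X)⁻¹X′Y. A minimal numpy sketch on simulated data (parameter values are illustrative, loosely echoing the California estimates below; this is not the actual dataset):

import numpy as np

# Simulate Y_i = beta_0 + beta_1*X_1i + beta_2*X_2i + u_i with known betas.
rng = np.random.default_rng(1)
n = 420
x1 = rng.normal(20, 2, size=n)    # e.g., a student-teacher ratio
x2 = rng.normal(15, 10, size=n)   # e.g., a percent English learners
y = 686.0 - 1.1 * x1 - 0.65 * x2 + rng.normal(0, 14, size=n)

# OLS via the normal equations (X'X) b = X'Y.
X = np.column_stack([np.ones(n), x1, x2])      # intercept plus two regressors
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                                # approximately (686, -1.1, -0.65)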
Example: the California test score data
Regression of TestScore against STR:
TestScore-hat = 698.9 − 2.28·STR
Now include percent English Learners in the district (PctEL):
TestScore-hat = 686.0 − 1.10·STR − 0.65·PctEL
• What happens to the coefficient on STR?
• (Note: corr(STR, PctEL) = 0.19)
Multiple regression in STATA
reg testscr str pctel, robust;
Regression with robust standard errors Number of obs = 420
F( 2, 417) = 223.82
Prob > F = 0.0000
R-squared = 0.4264
Root MSE = 14.464
-----------------------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+--------------------------------------------------------------------------
str | −1.101296 .4328472 −2.54 0.011 −1.95213 −.2504616
pctel | −.6497768 .0310318 −20.94 0.000 −.710775 −.5887786
_cons | 686.0322 8.728224 78.60 0.000 668.8754 703.189
-----------------------------------------------------------------------------------------
TestScore-hat = 686.0 − 1.10·STR − 0.65·PctEL
More on this printout later…
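A rough Python equivalent of this Stata command, assuming the same data sit in a pandas DataFrame df with columns testscr, str, and pctel (hypothetical names); Stata's robust option corresponds to HC1 standard errors in statsmodels:

import statsmodels.formula.api as smf

# Sketch: reproduce `reg testscr str pctel, robust`.
results = smf.ols("testscr ~ str + pctel", data=df).fit(cov_type="HC1")
print(results.summary())  # coefficients and robust SEs comparable to the printout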
Measures of Fit for Multiple Regression
(SW Section 6.4) (1 of 2)
Actual = predicted + residual: Y_i = Ŷ_i + û_i
SER = std. deviation of û_i (with d.f. correction)
RMSE = std. deviation of û_i (without d.f. correction)
R² = fraction of variance of Y explained by X
R̄² = “adjusted R²” = R² with a degrees-of-freedom correction that adjusts for estimation uncertainty; R̄² < R²
SER and RMSE
As in regression with a single regressor, the SER and the RMSE are
measures of the spread of the Ys around the regression line:
SER = √[ (1/(n − k − 1)) Σ_{i=1}^n û_i² ]
RMSE = √[ (1/n) Σ_{i=1}^n û_i² ]
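A direct numpy translation of the two formulas, assuming a residual vector u_hat from a regression with k regressors (names are illustrative):

import numpy as np

def ser(u_hat, k):
    # with degrees-of-freedom correction: divide by n - k - 1
    n = len(u_hat)
    return np.sqrt(np.sum(u_hat**2) / (n - k - 1))

def rmse(u_hat):
    # without degrees-of-freedom correction: divide by n
    return np.sqrt(np.mean(u_hat**2))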
R² and R̄² (adjusted R²) (1 of 2)
The R² is the fraction of the variance explained – same definition as in regression with a single regressor:
R² = ESS/TSS = 1 − SSR/TSS,
where ESS = Σ_{i=1}^n (Ŷ_i − Ȳ)², SSR = Σ_{i=1}^n û_i², TSS = Σ_{i=1}^n (Y_i − Ȳ)².
• The R² always increases when you add another regressor (why?) – a bit of a problem for a measure of “fit”
R² and R̄² (adjusted R²) (2 of 2)
The R̄² (the “adjusted R²”) corrects this problem by “penalizing” you for including another regressor – the R̄² does not necessarily increase when you add another regressor.
Adjusted R²: R̄² = 1 − [(n − 1)/(n − k − 1)] × (SSR/TSS)
Note that R̄² < R²; however, if n is large the two will be very close.
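Both fit measures, written out from the ESS/SSR/TSS definitions above (a sketch; y is the outcome vector, y_hat the fitted values, k the number of regressors):

import numpy as np

def r_squared(y, y_hat):
    ssr = np.sum((y - y_hat) ** 2)       # sum of squared residuals
    tss = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
    return 1 - ssr / tss

def adj_r_squared(y, y_hat, k):
    # penalizes extra regressors; always below r_squared, converging to it as n grows
    n = len(y)
    ssr = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    return 1 - (n - 1) / (n - k - 1) * (ssr / tss)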