


Heteroskedasticity in a picture:

• E(u|X=x) = 0 (u satisfies Least Squares Assumption #1)

• The variance of u does depend on x: u is heteroskedastic.
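Since the picture itself is not reproduced here, the two bullet points can be mimicked with simulated data. This is only a sketch: the error process (standard deviation of u proportional to x), the seed and the sample size are assumptions made for illustration and are not taken from the slides. It compares the mean and the spread of u for low and high values of x.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed (assumption)
n = 20_000                            # arbitrary sample size (assumption)

x = rng.uniform(1.0, 5.0, size=n)
# Hypothetical error process: mean zero given x, standard deviation 0.5 * x.
u = rng.normal(0.0, 1.0, size=n) * 0.5 * x

low = x < 3.0                         # split the sample at the midpoint of x
mean_low, mean_high = u[low].mean(), u[~low].mean()
sd_low, sd_high = u[low].std(), u[~low].std()

print(f"mean of u: low x = {mean_low:.3f}, high x = {mean_high:.3f}")
print(f"sd of u:   low x = {sd_low:.3f}, high x = {sd_high:.3f}")
```

Under this design both group means should be near zero, in line with E(u|X=x) = 0, while the standard deviation should be clearly larger in the high-x group.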

A real-data example from labor economics: average hourly earnings vs.

years of education (data source: Current Population Survey):

Heteroskedastic or homoskedastic?

The class size data:

Heteroskedastic or homoskedastic?

So far we have (without saying so) assumed that u might be heteroskedastic.


Recall the three least squares assumptions:

1. E(u|X = x) = 0

2. $(X_i, Y_i)$, $i = 1,\dots,n$, are i.i.d.

3. Large outliers are rare

Heteroskedasticity and homoskedasticity concern var(u|X=x). Because

we have not explicitly assumed homoskedastic errors, we have

implicitly allowed for heteroskedasticity.

What if the errors are in fact homoskedastic?

• You can prove that OLS has the lowest variance among estimators

that are linear in Y… a result called the Gauss-Markov theorem that

we will return to shortly.

• The formula for the variance of $\hat{\beta}_1$ and the OLS standard error simplifies (pp. 4.4): If $\mathrm{var}(u_i|X_i = x) = \sigma_u^2$, then

\[
\mathrm{var}(\hat{\beta}_1) = \frac{\mathrm{var}[(X_i - \mu_X)u_i]}{n(\sigma_X^2)^2} = \frac{E[(X_i - \mu_X)^2 u_i^2]}{n(\sigma_X^2)^2} = \frac{\sigma_u^2}{n\sigma_X^2}
\]



Note: $\mathrm{var}(\hat{\beta}_1)$ is inversely proportional to var(X): more spread in X means more information about $\hat{\beta}_1$; we discussed this earlier but it is clearer from this formula.
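The last equality in the variance formula uses a step that the slide leaves implicit. A sketch of it, relying on the law of iterated expectations together with $E(u_i|X_i) = 0$ and $\mathrm{var}(u_i|X_i) = \sigma_u^2$ (so that $E(u_i^2|X_i) = \sigma_u^2$):

```latex
\begin{aligned}
E\big[(X_i-\mu_X)^2 u_i^2\big]
  &= E\big[(X_i-\mu_X)^2\, E(u_i^2 \mid X_i)\big] \\
  &= \sigma_u^2\, E\big[(X_i-\mu_X)^2\big] = \sigma_u^2\,\sigma_X^2 ,
\\[4pt]
\operatorname{var}(\hat\beta_1)
  &= \frac{\sigma_u^2\,\sigma_X^2}{n(\sigma_X^2)^2}
   = \frac{\sigma_u^2}{n\,\sigma_X^2}.
\end{aligned}
```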

Along with this homoskedasticity-only formula for the variance of $\hat{\beta}_1$, we have homoskedasticity-only standard errors:

Homoskedasticity-only standard error formula:

\[
SE(\hat{\beta}_1) = \sqrt{\frac{1}{n} \times \frac{\frac{1}{n-2}\sum_{i=1}^{n} \hat{u}_i^2}{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}}
\]



Remember that the OLS estimator minimizes the SSR, i.e. it finds the parameter estimates whose residuals give the smallest sum of squared residuals; therefore OLS has the minimum $SE(\hat{\beta}_1)$ if the errors are homoskedastic.



We now have two formulas for standard errors for $\hat{\beta}_1$.


• Homoskedasticity-only standard errors – these are valid only if the

errors are homoskedastic.

• The usual standard errors – to differentiate the two, it is conventional to call these heteroskedasticity-robust standard errors, because they are valid whether or not the errors are heteroskedastic.

• The main advantage of the homoskedasticity-only standard errors is that the formula is simpler. The disadvantage is that the formula is correct only if the errors are homoskedastic.

Practical implications…

• The homoskedasticity-only formula for the standard error of $\hat{\beta}_1$ and the “heteroskedasticity-robust” formula differ – so in general, you get different standard errors using the different formulas.

• Homoskedasticity-only standard errors are the default setting in

regression software. To get the general “heteroskedasticity-robust”

standard errors you must override the default.

• If you don’t override the default and there is in fact heteroskedasticity, your standard errors (and your t-statistics and confidence intervals) will be wrong – typically, homoskedasticity-only SEs are too small.
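The difference between the two formulas can be checked numerically. The sketch below computes both standard errors by hand with numpy. The robust formula used is the standard one with the n − 2 degrees-of-freedom adjustment, which is assumed here to be the variant used in the course; the simulated designs, the seed and the helper name `ols_standard_errors` are hypothetical and serve only as an illustration.

```python
import numpy as np

def ols_standard_errors(x, y):
    """Return (slope, homoskedasticity-only SE, heteroskedasticity-robust SE)."""
    n = len(x)
    dx = x - x.mean()
    sxx = np.sum(dx ** 2)
    b1 = np.sum(dx * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x

    # Homoskedasticity-only: sqrt( [SSR / (n - 2)] / sum (X_i - Xbar)^2 )
    se_homo = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)
    # Robust: sqrt( (1/n) * [1/(n-2) sum dx^2 uhat^2] / [(1/n) sum dx^2]^2 )
    se_robust = np.sqrt(np.sum(dx ** 2 * resid ** 2) / (n - 2) / n / (sxx / n) ** 2)
    return b1, se_homo, se_robust

rng = np.random.default_rng(2)        # arbitrary seed (assumption)
n = 10_000
x = rng.uniform(0.0, 4.0, size=n)
z = rng.normal(0.0, 1.0, size=n)

# Hypothetical designs: same regression line, two different error processes.
y_homo = 1.0 + 2.0 * x + 4.0 * z      # var(u|X) constant
y_hetero = 1.0 + 2.0 * x + x ** 2 * z # var(u|X) grows with x

for label, y in [("homoskedastic", y_homo), ("heteroskedastic", y_hetero)]:
    b1, se_homo, se_robust = ols_standard_errors(x, y)
    print(f"{label:16s} slope={b1:.3f}  SE(homo-only)={se_homo:.4f}  SE(robust)={se_robust:.4f}")
```

In the homoskedastic design the two standard errors should be nearly identical; in the heteroskedastic design the homoskedasticity-only standard error should come out noticeably smaller than the robust one.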

The bottom line:

• If the errors are either homoskedastic or heteroskedastic and you use heteroskedasticity-robust standard errors, you are OK.

• If the errors are heteroskedastic and you use the homoskedasticity-only formula for standard errors, your standard errors will be wrong (the homoskedasticity-only estimator of the variance of $\hat{\beta}_1$ is inconsistent if there is heteroskedasticity).

• The two formulas coincide (when n is large) in the special case of homoskedasticity.

• So, you should always use heteroskedasticity-robust standard errors!

Some Additional Theoretical Foundations of OLS

(Section 5.5)

We have already learned a very great deal about OLS: OLS is

unbiased and consistent; we have a formula for heteroskedasticity-

robust standard errors; and we can construct confidence intervals and

test statistics.

Also, a very good reason to use OLS is that everyone else does – so

by using it, others will understand what you are doing. In effect, OLS is

the language of regression analysis, and if you use a different estimator,

you will be speaking a different language.

Still, some of you may have further questions:

• Is this really a good reason to use OLS? Aren’t there other

estimators that might be better – in particular, ones that might have a

smaller variance?

• Also, whatever happened to our old friend, the Student t distribution?


So we will now answer these questions – but to do so we will need to

make some stronger assumptions than the three least squares

assumptions already presented.

The Extended Least Squares Assumptions

These consist of the three LS assumptions, plus two more:

1. E(u|X = x) = 0.

2. $(X_i, Y_i)$, $i = 1,\dots,n$, are i.i.d.

3. Large outliers are rare ($E(Y^4) < \infty$, $E(X^4) < \infty$).

4. u is homoskedastic.

5. u is distributed $N(0, \sigma^2)$.

• Assumptions 4 and 5 are more restrictive – so they apply to fewer

cases in practice. However, if you make these assumptions, then

certain mathematical calculations simplify and you can prove strong

results – results that hold if these additional assumptions are true.

• We start with a discussion of the efficiency of OLS.

Efficiency of OLS, part I: The Gauss-Markov Theorem

• Under extended LS assumptions 1-4 (the basic three, plus homoskedasticity), $\hat{\beta}_1$ has the smallest variance among all linear estimators (estimators that are linear functions of $Y_1,\dots,Y_n$) that are conditionally unbiased. This is the Gauss-Markov theorem (proven in SW Appendix 5.2).

The Gauss-Markov Theorem, ctd.


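• $\hat{\beta}_1$ is a linear estimator, that is, it can be written as a linear function of $Y_1,\dots,Y_n$:

\[
\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n}(X_i - \bar{X})u_i}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \beta_1 + \frac{1}{n}\sum_{i=1}^{n} w_i u_i,
\qquad \text{where } w_i = \frac{X_i - \bar{X}}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2}.
\]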

• The G-M theorem says that among all possible choices of $\{w_i\}$, the OLS weights yield the smallest $\mathrm{var}(\hat{\beta}_1)$.
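The weight representation and the variance comparison can be illustrated numerically. The sketch below holds X fixed and writes two estimators of the slope as weighted sums of the Y values: OLS, and a hypothetical competitor (a "grouping" estimator that divides the difference in mean Y between the upper and lower halves of the sample by the corresponding difference in mean X). The competitor, the seed and the sample size are assumptions chosen for illustration. Under homoskedasticity, the conditional variance of any linear estimator with weights a_i is σ_u² times the sum of a_i², so comparing sums of squared weights compares variances.

```python
import numpy as np

rng = np.random.default_rng(1)        # arbitrary seed (assumption)
n = 200
x = rng.uniform(0.0, 10.0, size=n)    # regressor values, held fixed
dx = x - x.mean()

# OLS slope as a linear function of Y: beta1_hat = sum(a_ols * y)
a_ols = dx / np.sum(dx ** 2)

# Hypothetical competitor: difference of group means of Y divided by the
# difference of group means of X, splitting the sample at the median of X.
high = x > np.median(x)
a_alt = np.where(high, 1.0 / high.sum(), -1.0 / (~high).sum())
a_alt = a_alt / np.sum(a_alt * x)

for name, a in [("OLS", a_ols), ("grouping", a_alt)]:
    # sum(a) = 0 and sum(a * x) = 1 make the estimator conditionally unbiased.
    print(name, "sum a =", round(a.sum(), 12), " sum a*x =", round(np.sum(a * x), 12),
          " var / sigma_u^2 =", np.sum(a ** 2))

# Check the representation beta1_hat = beta1 + (1/n) * sum(w_i * u_i).
w = dx / np.mean(dx ** 2)             # the weights w_i defined above
beta0, beta1 = 1.0, 2.0               # hypothetical true coefficients
u = rng.normal(0.0, 1.0, size=n)
y = beta0 + beta1 * x + u
beta1_hat = np.sum(a_ols * y)
print("identity holds:", np.isclose(beta1_hat, beta1 + np.mean(w * u)))
```

Both sets of weights satisfy the unbiasedness conditions, and the OLS weights have the smaller sum of squares, which is what the theorem predicts for any such competitor.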


Efficiency of OLS, part II:

• Under all five extended LS assumptions – including normally distributed errors – $\hat{\beta}_1$ has the smallest variance of all consistent estimators (linear or nonlinear functions of $Y_1,\dots,Y_n$), as $n \to \infty$.

• This is a pretty amazing result – it says that, if (in addition to LSA 1-

3) the errors are homoskedastic and normally distributed, then OLS is

a better choice than any other consistent estimator. And because an

estimator that isn’t consistent is a poor choice, this says that OLS

really is the best you can do – if all five extended LS assumptions

hold.

Some not-so-good things about OLS

The foregoing results are impressive, but these results – and the OLS

estimator – have important limitations.

1. The GM theorem really isn’t that compelling:

• The condition of homoskedasticity often doesn’t hold

(homoskedasticity is special)

• The result is only for linear estimators – only a small subset of

estimators (more on this in a moment)

2. The strongest optimality result (“part II” above) requires

homoskedastic normal errors – not plausible in applications (think

about the hourly earnings data!)

Limitations of OLS, ctd.

3. OLS is more sensitive to outliers than some other estimators. In the

case of estimating the population mean, if there are big outliers, then

the median is preferred to the mean because the median is less

sensitive to outliers – it has a smaller variance than OLS when there

are outliers. Similarly, in regression, OLS can be sensitive to outliers,

and if there are big outliers other estimators can be more efficient

(have a smaller variance). One such estimator is the least absolute deviations (LAD; see exercise session 2) estimator:

\[
\min_{b_0, b_1} \sum_{i=1}^{n} \left| Y_i - (b_0 + b_1 X_i) \right|
\]

In virtually all applied regression analysis, OLS is used – and that is

what we will do in this course too.
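To make the sensitivity to outliers concrete, the sketch below fits OLS and an approximate LAD line to simulated data in which a few observations are contaminated. The LAD fit uses iteratively reweighted least squares (IRLS), a numerical technique swapped in here for convenience and not one the slides prescribe; an exact LAD solution would normally come from a linear-programming or quantile-regression routine. The data, the seed and the function names are hypothetical.

```python
import numpy as np

def ols_fit(x, y):
    """OLS intercept and slope."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lad_fit(x, y, n_iter=200, eps=1e-8):
    """Approximate LAD fit via iteratively reweighted least squares."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]     # start from OLS
    for _ in range(n_iter):
        r = np.abs(y - X @ b)
        sw = 1.0 / np.sqrt(np.maximum(r, eps))   # square roots of weights 1/|r_i|
        b_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(b_new - b)) < 1e-10
        b = b_new
        if converged:
            break
    return b

rng = np.random.default_rng(4)        # arbitrary seed (assumption)
n = 100
x = rng.uniform(0.0, 10.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)   # hypothetical true slope: 2
y[np.argsort(x)[-5:]] += 50.0         # contaminate the 5 observations with largest x

b_ols = ols_fit(x, y)
b_lad = lad_fit(x, y)
sad_ols = np.sum(np.abs(y - b_ols[0] - b_ols[1] * x))
sad_lad = np.sum(np.abs(y - b_lad[0] - b_lad[1] * x))
print("OLS slope:", b_ols[1], " sum |resid|:", sad_ols)
print("LAD slope:", b_lad[1], " sum |resid|:", sad_lad)
```

In this design the OLS slope should be pulled well away from 2 by the contaminated points, while the LAD slope should stay close to it.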

Inference if u is Homoskedastic and Normal:

the Student t Distribution (Section 5.6)

Recall the five extended LS assumptions:

1. E(u|X = x) = 0.

2. $(X_i, Y_i)$, $i = 1,\dots,n$, are i.i.d.

3. Large outliers are rare ($E(Y^4) < \infty$, $E(X^4) < \infty$).

4. u is homoskedastic.

5. u is distributed $N(0, \sigma^2)$.

If all five assumptions hold, then:

• $\hat{\beta}_0$ and $\hat{\beta}_1$ are normally distributed for all n (!)

• the t-statistic has a Student t distribution with n – 2 degrees of freedom – this holds exactly for all n (!)
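The practical meaning of the exact result can be seen in a small Monte Carlo experiment. The sketch below draws many samples of size n = 5 with homoskedastic normal errors, computes the t-statistic for the true slope using the homoskedasticity-only standard error, and compares rejection rates under the normal critical value 1.96 and the Student t critical value with 3 degrees of freedom (3.182). The design, the seed and the number of replications are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)        # arbitrary seed (assumption)
n, reps = 5, 20_000
beta0, beta1 = 1.0, 2.0               # hypothetical true coefficients

x = rng.uniform(0.0, 10.0, size=(reps, n))
u = rng.normal(0.0, 1.0, size=(reps, n))      # homoskedastic normal errors
y = beta0 + beta1 * x + u

# OLS slope, intercept and residuals for every replication (one per row).
dx = x - x.mean(axis=1, keepdims=True)
sxx = np.sum(dx ** 2, axis=1)
b1 = np.sum(dx * y, axis=1) / sxx
b0 = y.mean(axis=1) - b1 * x.mean(axis=1)
resid = y - b0[:, None] - b1[:, None] * x

# Homoskedasticity-only standard error and t-statistic for H0: slope = beta1.
se = np.sqrt(np.sum(resid ** 2, axis=1) / (n - 2) / sxx)
t = (b1 - beta1) / se

reject_normal = np.mean(np.abs(t) > 1.96)     # normal 5% critical value
reject_t3 = np.mean(np.abs(t) > 3.182)        # Student t 5% critical value, 3 d.f.
print("rejection rate with 1.96: ", reject_normal)
print("rejection rate with 3.182:", reject_t3)
```

With the Student t critical value the rejection rate should be close to the nominal 5%, whereas the normal critical value should reject a true null hypothesis considerably more often at this sample size.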






Course material for the Applied Econometrics (Econometria applicata) course taught by Prof. Roberto Golinelli. These are English-language slides prepared by the lecturer, covering the following topics: the linear regression model with a single regressor and the confidence interval; homoskedasticity and heteroskedasticity; the Gauss-Markov theorem.

Degree program: Corso di laurea in economia, mercati e istituzioni (Economics, Markets and Institutions)
University: Bologna - Unibo
Academic year: 2011-2012

The contents of this page are the Publisher Atreyu's personal reworking of information learned by attending the Applied Econometrics lectures and through independent study of any reference books, in preparation for the final exam or thesis. They should not be regarded as official material of the University of Bologna - Unibo or of Prof. Roberto Golinelli.
