Example: The California class size data

(1)   testscr_hat = 698.9 − 2.28×STR
                    (10.4)   (0.52)

(2)   testscr_hat = 686.0 − 1.10×STR − 0.650×el_pct
                     (8.7)   (0.43)      (0.031)

• The coefficient on STR in (2) is the effect on TestScores of a unit

change in STR, holding constant the percentage of English Learners in

the district

• The coefficient on STR falls by about one-half (from 2.28 to 1.10) when el_pct is added

• The 95% confidence interval for the coefficient on STR in (2) is {–1.10 ± 1.96×0.43} = (–1.95, –0.26)

• The t-statistic testing β_STR = 0 is t = –1.10/0.43 = –2.54, so we reject the hypothesis at the 5% significance level
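As a check on the arithmetic, here is a minimal Python sketch of the confidence interval and t-statistic for the STR coefficient in (2), using the rounded estimate and standard error shown above (the slide's –1.95 and –2.54 come from unrounded values):

```python
# Minimal sketch: 95% CI and t-statistic for the STR coefficient in regression (2).
beta_hat, se = -1.10, 0.43

ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)  # about (-1.94, -0.26) with rounded inputs
t_stat = beta_hat / se                             # about -2.56 with rounded inputs
print(ci, t_stat)
```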

Tests of Joint Hypotheses (SW Section 7.2)

Let Expn = expenditures per pupil (in thousand $, therefore Expn =

expn_stu/1000) and consider the population regression model:

testscr_i = β0 + β1·STR_i + β2·Expn_i + β3·el_pct_i + u_i

The null hypothesis that “school resources don’t matter,” and the

alternative that they do, corresponds to:

H0: β1 = 0 and β2 = 0

vs. H1: either β1 ≠ 0 or β2 ≠ 0, or both

TestScore_i = β0 + β1·STR_i + β2·Expn_i + β3·el_pct_i + u_i
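The slides carry out these tests in GRETL; purely as an illustrative sketch (not the course's GRETL workflow), the same joint test can be run in Python with statsmodels, assuming a pandas DataFrame df with columns testscr, STR, Expn and el_pct (hypothetical column names chosen to match the slides):

```python
# Illustrative sketch: joint test of H0: beta1 = beta2 = 0 with
# heteroskedasticity-robust (HC1) standard errors.
import statsmodels.formula.api as smf

results = smf.ols("testscr ~ STR + Expn + el_pct", data=df).fit(cov_type="HC1")
print(results.f_test("STR = 0, Expn = 0"))  # robust F-test of the two restrictions
```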

The F-statistic

The F-statistic tests all parts of a joint hypothesis at once.

Formula for the special case of the joint hypothesis β1 = β1,0 and β2 = β2,0 in a regression with two regressors:

F = (1/2) × (t1² + t2² − 2·ρ̂_{t1,t2}·t1·t2) / (1 − ρ̂²_{t1,t2})

where ρ̂_{t1,t2} estimates the correlation between t1 and t2.

Reject when F is large (how large?)
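A minimal Python sketch of this special-case formula, written as a small function (the name and arguments are just illustrative):

```python
def f_stat_two_restrictions(t1, t2, rho_hat):
    """F-statistic for a joint test of two restrictions, computed from the two
    individual t-statistics and the estimated correlation between them."""
    return 0.5 * (t1**2 + t2**2 - 2.0 * rho_hat * t1 * t2) / (1.0 - rho_hat**2)
```

When ρ̂_{t1,t2} = 0 the formula collapses to the simple average (t1² + t2²)/2, the special case examined below under the large-sample distribution.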

The F-statistic testing β1 and β2:

F = (1/2) × (t1² + t2² − 2·ρ̂_{t1,t2}·t1·t2) / (1 − ρ̂²_{t1,t2})

• The F-statistic is large when t1 and/or t2 is large

• The F-statistic corrects (in just the right way) for the correlation between t1 and t2.

• The formula for more than two β's is nasty unless you use matrix algebra.

• This gives the F-statistic a nice large-sample approximate distribution, which is…

Large-sample distribution of the F-statistic

Consider the special case that t1 and t2 are independent, so ρ̂_{t1,t2} → 0 in probability; in large samples the formula becomes

F = (1/2) × (t1² + t2² − 2·ρ̂_{t1,t2}·t1·t2) / (1 − ρ̂²_{t1,t2}) ≅ (1/2)(t1² + t2²)

• Under the null, t1 and t2 have standard normal distributions that, in this special case, are independent

• The large-sample distribution of the F-statistic is the distribution of the average of two independently distributed squared standard normal (chi-squared) random variables
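To answer "how large is large?": with q = 2 restrictions, the 5% critical value of this large-sample χ²_q/q distribution can be looked up, for example with scipy (an illustrative sketch, not part of the original slides):

```python
# 5% critical value of the chi-squared_q / q distribution for q = 2 restrictions.
from scipy.stats import chi2

q = 2
print(chi2.ppf(0.95, q) / q)  # about 3.00: reject H0 at the 5% level when F > 3.00
```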

More on F-statistics:

a simple F-statistic formula that is easy to understand (it is only valid if

the errors are homoskedastic, but it might help intuition).

The homoskedasticity-only F-statistic

When the errors are homoskedastic, there is a simple formula for

computing the “homoskedasticity-only” F-statistic:

• Run two regressions, one under the null hypothesis (the “restricted”

regression) and one under the alternative hypothesis (the

“unrestricted” regression).

• Compare the fits of the regressions – the R²'s – if the "unrestricted" model fits sufficiently better, reject the null

The “restricted” and “unrestricted” regressions

Example: are the coefficients on STR and Expn zero?

Unrestricted population regression (under H1):

testscr_i = β0 + β1·STR_i + β2·Expn_i + β3·el_pct_i + u_i

Restricted population regression (that is, under H0):

testscr_i = β0 + β3·el_pct_i + u_i   (why?)

• The number of restrictions under H0 is q = 2 (why?).

• The fit will be better (R² will be higher) in the unrestricted regression (why?)

By how much must the R² increase for the coefficients on STR and Expn to be judged statistically significant?
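Before turning to the formula that answers this, here is an illustrative sketch of the comparison itself (again assuming the hypothetical DataFrame df with columns testscr, STR, Expn and el_pct used earlier):

```python
# Illustrative sketch: fit the unrestricted and restricted regressions and
# compare their R-squareds (assumes the hypothetical DataFrame `df` used earlier).
import statsmodels.formula.api as smf

unrestricted = smf.ols("testscr ~ STR + Expn + el_pct", data=df).fit()
restricted = smf.ols("testscr ~ el_pct", data=df).fit()
print(unrestricted.rsquared, restricted.rsquared)  # the unrestricted R-squared is larger
```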

Simple formula for the homoskedasticity-only F-statistic:

F = [(R²_unrestricted − R²_restricted) / q] / [(1 − R²_unrestricted) / (n − k_unrestricted − 1)]

where:
R²_restricted = the R² for the restricted regression
R²_unrestricted = the R² for the unrestricted regression
q = the number of restrictions under the null
k_unrestricted = the number of regressors in the unrestricted regression.

• The bigger the difference between the restricted and unrestricted R²'s – the greater the improvement in fit by adding the variables in question – the larger is the homoskedasticity-only F.

Example:

Restricted regression:

testscr_hat = 644.7 − 0.671×el_pct,   R²_restricted = 0.4149
               (1.0)   (0.032)

Unrestricted regression:

testscr_hat = 649.6 − 0.29×STR + 3.87×Expn − 0.656×el_pct
              (15.5)   (0.48)     (1.59)      (0.032)

R²_unrestricted = 0.4366,   k_unrestricted = 3,   q = 2

so F = [(R²_unrestricted − R²_restricted) / q] / [(1 − R²_unrestricted) / (n − k_unrestricted − 1)]
     = [(.4366 − .4149) / 2] / [(1 − .4366) / (420 − 3 − 1)] = 8.01

Note: the heteroskedasticity-robust F = 5.43…
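A minimal Python sketch that reproduces the 8.01 above from the formula on the previous slide, using the numbers of this example:

```python
def homosk_only_f(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F-statistic from the R-squared comparison formula."""
    return ((r2_unrestricted - r2_restricted) / q) / (
        (1 - r2_unrestricted) / (n - k_unrestricted - 1)
    )

print(homosk_only_f(0.4366, 0.4149, q=2, n=420, k_unrestricted=3))  # about 8.01
```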

The homoskedasticity-only F-statistic – summary

F = [(R²_unrestricted − R²_restricted) / q] / [(1 − R²_unrestricted) / (n − k_unrestricted − 1)]

The homoskedasticity-only F-statistic rejects when adding the two variables increases the R² by "enough" – that is, when adding the two variables improves the fit of the regression by "enough".

Alternatively, the F-test can be expressed in terms of SSRs instead of R²'s:

F = [(SSR_restricted − SSR_unrestricted) / q] / [SSR_unrestricted / (n − k_unrestricted − 1)]
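The two forms are algebraically identical: both regressions have the same dependent variable and hence the same TSS, and since R² = 1 − SSR/TSS the TSS cancels, so that

[(R²_unrestricted − R²_restricted) / q] / [(1 − R²_unrestricted) / (n − k_unrestricted − 1)]
   = [(SSR_restricted − SSR_unrestricted) / q] / [SSR_unrestricted / (n − k_unrestricted − 1)]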

Digression: The F distribution

Your regression printouts might refer to the “F” distribution.

If the four multiple regression LS assumptions hold and:

5. u_i is homoskedastic, i.e. var(u_i | X_1i, …, X_ki) does not depend on the X's

6. u_1, …, u_n are normally distributed

then the homoskedasticity-only F-statistic has the F_{q,n−k−1} distribution, where q = the number of restrictions and k = the number of regressors under the alternative (the unrestricted model).

The F_{q,n−k−1} distribution:

• The F distribution is tabulated in many places

• As n → ∞, the F_{q,n−k−1} distribution asymptotes to the χ²_q/q distribution (see the sketch after this list)

• For q not too big and n ≥ 100, the F_{q,n−k−1} distribution and the χ²_q/q distribution are essentially identical.

• Many regression packages (including GRETL) compute p-values of F-statistics using the F distribution

• You will encounter the F distribution in published empirical work.
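As an illustrative sketch of this asymptotic equivalence (not part of the original slides), compare the two 5% critical values for q = 2 restrictions and the degrees of freedom of the class size example (n − k − 1 = 420 − 3 − 1 = 416):

```python
# Compare 5% critical values: F(q, n-k-1) versus chi-squared_q / q.
from scipy.stats import chi2, f

q, dof = 2, 416
print(f.ppf(0.95, q, dof))     # F critical value, just above 3.00
print(chi2.ppf(0.95, q) / q)   # chi-squared_q / q critical value, about 3.00
```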

Another digression: A little history of statistics…

• The theory of the homoskedasticity-only F-statistic and the F_{q,n−k−1} distribution rests on implausibly strong assumptions (are earnings normally distributed?)

• These statistics date to the early 20th century… back in the days when data sets were small and computers were people…

• The F-statistic and F_{q,n−k−1} distribution were major breakthroughs: an easily computed formula; a single set of tables that could be published once, then applied in many settings; and a precise, mathematically elegant justification.

A little history of statistics, ctd…

• The strong assumptions seemed a minor price for this breakthrough.

• But with modern computers and large samples we can use the heteroskedasticity-robust F-statistic and the F_{q,∞} distribution, which only require the four least squares assumptions (not assumptions #5 and #6)

• This historical legacy persists in modern software, in which homoskedasticity-only standard errors (and F-statistics) are the default, and in which p-values are computed using the F_{q,n−k−1} distribution.

Summary: the homoskedasticity-only F-statistic and the F distribution

• These are justified only under very strong conditions – stronger than

are realistic in practice.

• Yet, they are widely used.

• You should use the heteroskedasticity-robust F-statistic, with χ²_q/q (that is, F_{q,∞}) critical values.

• For n ≥ 100, the F distribution essentially is the χ²_q/q distribution.

• For small n, sometimes researchers use the F distribution because it has larger critical values and in this sense is more conservative.

Summary: testing joint hypotheses

• The “one at a time” approach of rejecting if either of the t-statistics

exceeds 1.96 rejects more than 5% of the time under the null (the

size exceeds the desired significance level)

• The heteroskedasticity-robust F-statistic is built in to GRETL (“test”

command); this tests all q restrictions at once.

• For n large, the F-statistic is distributed χ²_q/q (= F_{q,∞})

• The homoskedasticity-only F-statistic is important historically (and thus in practice), and can help intuition, but isn't valid when there is heteroskedasticity

