Random variables: joint distributions and covariance

MULTIVARIATE:

• Random variables X and Y have a joint distribution

• The covariance between X and Y is

$\operatorname{cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sigma_{XY}$

• The covariance is a measure of the linear association between X and Y; its units are units of X × units of Y

• cov(X,Y) > 0 means a positive relation between X and Y

• If X and Y are independently distributed, then cov(X,Y) = 0 (but not

vice versa!!)

• The covariance of a r.v. with itself is its variance:

$\operatorname{cov}(X,X) = E[(X - \mu_X)(X - \mu_X)] = E[(X - \mu_X)^2] = \sigma_X^2$

The covariance between Test Score and STR is negative:

so is the correlation…

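The correlation coefficient is defined in terms of the covariance:

$\operatorname{corr}(X,Z) = \dfrac{\operatorname{cov}(X,Z)}{\sqrt{\operatorname{var}(X)\,\operatorname{var}(Z)}} = \dfrac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$

• $-1 \le \operatorname{corr}(X,Z) \le 1$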

• corr(X,Z) = 1 means perfect positive linear association

• corr(X,Z) = –1 means perfect negative linear association

• corr(X,Z) = 0 means no linear association

(Note: in our sample of 420 districts, $\hat{r} = -0.23$.)

Do not be misled by the symbols: earlier the variables were X and Y, now they are X and Z.
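The definitions above can be tried on numbers. The following is a minimal sketch, not the course's own code: it uses the sample analogues of the covariance and correlation (with the n - 1 divisor, an assumption made here), and the data values are invented purely for illustration. It shows that rescaling X changes the covariance, which carries units, but leaves the correlation unchanged.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_cov(x, y):
    """Sample analogue of cov(X,Y) = E[(X - mu_X)(Y - mu_Y)], n - 1 divisor."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def sample_corr(x, z):
    """Sample analogue of corr(X,Z) = cov(X,Z) / sqrt(var(X) var(Z))."""
    return sample_cov(x, z) / math.sqrt(sample_cov(x, x) * sample_cov(z, z))

# Hypothetical data, invented for illustration only.
x = [15.0, 18.0, 20.0, 22.0, 25.0]
z = [680.0, 665.0, 660.0, 650.0, 640.0]
x_rescaled = [10 * v for v in x]  # same variable measured in different units

print(sample_cov(x, z), sample_cov(x_rescaled, z))    # covariance scales by 10
print(sample_corr(x, z), sample_corr(x_rescaled, z))  # correlation is unchanged
```

Note that `sample_cov(x, x)` is the sample variance of x, mirroring the statement that the covariance of a r.v. with itself is its variance.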

The correlation coefficient measures linear association

In the plots below there is one case in which a relationship between the variables exists, but corr = 0 (which one? why?)
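A small numerical sketch of the same point, using made-up values rather than the data behind the plots: when Y is an exact function of X but the relationship is symmetric and nonlinear (here Y = X², an assumption chosen for illustration), the correlation is zero even though Y is completely determined by X.

```python
import math

def corr(x, y):
    """Correlation coefficient computed from deviations from the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]       # symmetric around zero
y_quadratic = [v ** 2 for v in x]  # exact, but nonlinear, dependence on x
y_linear = [2 * v + 1 for v in x]  # exact linear dependence on x

print(corr(x, y_quadratic))  # 0.0: no linear association
print(corr(x, y_linear))     # perfect positive linear association
```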

(c) Conditional distributions and conditional means

Conditional distributions

• The distribution of Y given value(s) of some other random variable X

• Example: the distribution of test scores, given that STR < 20

Conditional expectations and conditional moments

• conditional mean = mean of conditional distribution

= E(Y|X = x) (important concept and notation)

• conditional variance = variance of conditional distribution

• Example: E(Test scores | STR < 20) = the mean of test scores among

districts with small class sizes

The difference in means is the difference between the means of two conditional distributions:

$\Delta = E(\text{Test scores} \mid STR < 20) - E(\text{Test scores} \mid STR \ge 20)$

Conditional mean, ctd.

Other examples of conditional means:

• Wages of all female workers (Y = wages, X = gender)

• Mortality rate of those given an experimental treatment (Y = live/die;

X = treated/not treated)

• If E(X|Z) = const, then corr(X,Z) = 0 (not necessarily vice versa

however)

The conditional mean is a (possibly new) term for the familiar idea of the group mean.

Note: if $E(Y \mid X < 20) = E(Y \mid X \ge 20)$, then $\operatorname{cov}(Y, X) = 0$; and what about $r_{YX}$?
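The group-mean reading of the conditional mean can be sketched in a few lines. This is only an illustration: the helper name `conditional_mean` and the six districts are hypothetical, not values from the course data set.

```python
def conditional_mean(y, x, condition):
    """Mean of y over the observations whose x satisfies the condition."""
    selected = [yi for yi, xi in zip(y, x) if condition(xi)]
    return sum(selected) / len(selected)

# Hypothetical districts, invented for illustration only.
str_ratio = [17.0, 18.5, 19.0, 20.0, 21.5, 23.0]
test_score = [670.0, 664.0, 660.0, 655.0, 650.0, 645.0]

mean_small = conditional_mean(test_score, str_ratio, lambda s: s < 20)
mean_large = conditional_mean(test_score, str_ratio, lambda s: s >= 20)
delta = mean_small - mean_large  # difference between two conditional means

print(mean_small, mean_large, delta)
```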

(d) Distribution of a sample of data drawn randomly from a population: $Y_1, \ldots, Y_n$

We will assume simple random sampling

• Choose an individual (district, entity) at random from the population

Randomness and data

• Prior to sample selection, the value of Y is random because the

individual selected is random

• Once the individual is selected and the value of Y is observed, then Y

is just a number – not random

• The data set is $(Y_1, Y_2, \ldots, Y_n)$, where $Y_i$ = value of Y for the $i$-th individual (district, entity) sampled

Distribution of $Y_1, \ldots, Y_n$ under simple random sampling

• Because individuals #1 and #2 are selected at random, the value of $Y_1$ has no information content for $Y_2$. Thus:

o $Y_1$ and $Y_2$ are independently distributed

o $Y_1$ and $Y_2$ come from the same distribution, that is, $Y_1$, $Y_2$ are identically distributed

o That is, under simple random sampling, $Y_1$ and $Y_2$ are independently and identically distributed (i.i.d.).

o More generally, under simple random sampling, $\{Y_i\}$, $i = 1, \ldots, n$, are i.i.d.

This framework allows rigorous statistical inferences about moments of population distributions using a sample of data from that population

1. The probability framework for statistical inference

2. Estimation

3. Testing

4. Confidence Intervals

Estimation

$\bar{Y}$ is the natural estimator of the mean. But:

(a) What are the properties of $\bar{Y}$?

(b) Why should we use $\bar{Y}$ rather than some other estimator?

• $Y_1$ (the first observation)

• maybe unequal weights, not a simple average

• median($Y_1, \ldots, Y_n$)

The starting point is the sampling distribution of $\bar{Y}$, where $\bar{Y}$ is defined as

$\bar{Y} = \dfrac{\sum_{i=1}^{n} Y_i}{n} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i$

(a) The sampling distribution of $\bar{Y}$

$\bar{Y}$ is a random variable, and its properties are determined by the sampling distribution of $\bar{Y}$

• The individuals in the sample are drawn at random.

• Thus the values of $(Y_1, \ldots, Y_n)$ are random

• Thus functions of $(Y_1, \ldots, Y_n)$, such as $\bar{Y}$, are random: had a different sample been drawn, they would have taken on a different value

• The distribution of $\bar{Y}$ over different possible samples of size n is called the sampling distribution of $\bar{Y}$.

• The mean and variance of $\bar{Y}$ are the mean and variance of its sampling distribution, $E(\bar{Y})$ and $\operatorname{var}(\bar{Y})$; i.e. its first two moments.

• The concept of the sampling distribution underpins all of econometrics.
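The idea of "different possible samples of size n" can be made concrete with a simulation. The sketch below is an illustration under stated assumptions: the population is taken to be Uniform(0,1) (so the population mean is 0.5 and the population variance is 1/12), and the function name, seed, and sample counts are arbitrary choices, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(n, n_samples, seed=0):
    """Draw n_samples i.i.d. samples of size n and return the sample mean of each."""
    rng = random.Random(seed)
    # Assumed population: Uniform(0, 1), mean 0.5, variance 1/12.
    return [
        sum(rng.random() for _ in range(n)) / n
        for _ in range(n_samples)
    ]

means = simulate_sample_means(n=25, n_samples=20000)

# Each entry of `means` is one realisation of the sample mean; together they
# approximate its sampling distribution.
print(statistics.mean(means))       # close to the population mean 0.5
print(statistics.pvariance(means))  # close to (1/12) / 25
```

The simulated mean and variance should land near the population mean and near the population variance divided by n, up to simulation noise.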

Things we want to know about the sampling distribution:

• What is the mean of $\bar{Y}$?

o If $E(\bar{Y}) = \mu_Y$, then $\bar{Y}$ is an unbiased estimator of $\mu_Y$

• What is the variance of $\bar{Y}$?

o How does $\operatorname{var}(\bar{Y})$ depend on n (famous 1/n formula)

• Does $\bar{Y}$ become close to $\mu_Y$ when n is large?

o Law of large numbers: $\bar{Y}$ is a consistent estimator of $\mu_Y$

• $\bar{Y} - \mu_Y$ appears bell shaped for n large…is this generally true?

o In fact, $\bar{Y} - \mu_Y$ is approximately normally distributed for n large (Central Limit Theorem)

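The mean and variance of the sampling distribution of $\bar{Y}$

General case, that is, for $Y_i$ i.i.d. from any distribution:

mean: $E(\bar{Y}) = E\left(\dfrac{1}{n}\sum_{i=1}^{n} Y_i\right) = \dfrac{1}{n}\sum_{i=1}^{n} E(Y_i) = \dfrac{1}{n}\sum_{i=1}^{n} \mu_Y = \mu_Y$

Variance:

$$
\begin{aligned}
\operatorname{var}(\bar{Y}) &= E[\bar{Y} - E(\bar{Y})]^2 \\
&= E[\bar{Y} - \mu_Y]^2 \\
&= E\left[\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) - \mu_Y\right]^2 \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} (Y_i - \mu_Y)\right]^2 \\
&= \frac{1}{n^2}\, E\left[\sum_{i=1}^{n} (Y_i - \mu_Y)\right]^2 \\
&= \frac{1}{n^2}\sum_{i=1}^{n} E(Y_i - \mu_Y)^2 \\
&= \frac{1}{n^2}\sum_{i=1}^{n} \sigma_Y^2 = \frac{1}{n^2}\, n\, \sigma_Y^2 \\
&= \frac{\sigma_Y^2}{n}
\end{aligned}
$$

The step that moves the expectation inside the sum uses independence: for $i \ne j$ the cross terms $E[(Y_i - \mu_Y)(Y_j - \mu_Y)] = \operatorname{cov}(Y_i, Y_j)$ are zero.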

Mean and variance of sampling distribution of $\bar{Y}$, ctd.

$E(\bar{Y}) = \mu_Y$

$\operatorname{var}(\bar{Y}) = \dfrac{\sigma_Y^2}{n}$

Implications:

1. $\bar{Y}$ is an unbiased estimator of $\mu_Y$ (that is, $E(\bar{Y}) = \mu_Y$)

2. $\operatorname{var}(\bar{Y})$ is inversely proportional to n

• the spread of the sampling distribution is proportional to $1/\sqrt{n}$

• Thus the sampling uncertainty associated with $\bar{Y}$ is proportional to $1/\sqrt{n}$ (larger samples, less uncertainty, but square-root law)
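The square-root law is easy to see with numbers. In this sketch the population standard deviation of 19.0 is a hypothetical value picked for illustration; the point is only that quadrupling n halves the standard deviation of the sample mean.

```python
import math

def std_of_sample_mean(sigma_y, n):
    """Standard deviation of the sample mean: sigma_Y / sqrt(n)."""
    return sigma_y / math.sqrt(n)

sigma_y = 19.0  # hypothetical population standard deviation
for n in (100, 400, 1600):
    # Each fourfold increase in n cuts the spread in half.
    print(n, std_of_sample_mean(sigma_y, n))
```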

The sampling distribution of $\bar{Y}$ when n is large

For small sample sizes, the distribution of $\bar{Y}$ is complicated, but if n is large, the sampling distribution is simple!

1. As n increases, the distribution of $\bar{Y}$ becomes more tightly centered around $\mu_Y$ (the Law of Large Numbers)

2. Moreover, the distribution of $\bar{Y} - \mu_Y$ becomes normal (the Central Limit Theorem)

The Law of Large Numbers:

An estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases.

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $\sigma_Y^2 < \infty$, then $\bar{Y}$ is a consistent estimator of $\mu_Y$, that is, for any $\varepsilon > 0$,

$\Pr[\,|\bar{Y} - \mu_Y| < \varepsilon\,] \to 1$ as $n \to \infty$

which can be written, $\bar{Y} \xrightarrow{p} \mu_Y$

("$\bar{Y} \xrightarrow{p} \mu_Y$" means "$\bar{Y}$ converges in probability to $\mu_Y$").

the math: as $n \to \infty$, $\operatorname{var}(\bar{Y}) = \dfrac{\sigma_Y^2}{n} \to 0$, which implies that $\Pr[\,|\bar{Y} - \mu_Y| < \varepsilon\,] \to 1$.
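The slide leaves the word "implies" unexplained. One standard way to fill the gap, offered here as a supplementary sketch rather than as the slide's own argument, is Chebyshev's inequality applied to $\bar{Y}$, whose mean is $\mu_Y$ and whose variance is $\sigma_Y^2/n$:

```latex
\Pr\left[\,|\bar{Y} - \mu_Y| \ge \varepsilon\,\right]
  \;\le\; \frac{\operatorname{var}(\bar{Y})}{\varepsilon^2}
  \;=\; \frac{\sigma_Y^2}{n\,\varepsilon^2}
  \;\longrightarrow\; 0 \quad \text{as } n \to \infty,
\qquad\text{so}\qquad
\Pr\left[\,|\bar{Y} - \mu_Y| < \varepsilon\,\right]
  \;=\; 1 - \Pr\left[\,|\bar{Y} - \mu_Y| \ge \varepsilon\,\right]
  \;\longrightarrow\; 1 .
```

The bound holds for every fixed $\varepsilon > 0$, which is exactly what convergence in probability requires.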

The Central Limit Theorem (CLT):

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when n is large the distribution of $\bar{Y}$ is well approximated by a normal distribution.

• $\bar{Y}$ is approximately distributed $N(\mu_Y, \sigma_Y^2/n)$ ("normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2/n$")

• $\sqrt{n}\,(\bar{Y} - \mu_Y)/\sigma_Y$ is approximately distributed N(0,1) (standard normal)

• That is, "standardized" $\bar{Y} = \dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\operatorname{var}(\bar{Y})}} = \dfrac{\bar{Y} - \mu_Y}{\sigma_Y/\sqrt{n}}$ is approximately distributed as N(0,1)

• The larger is n, the better is the approximation.
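A simulation can make the CLT statement tangible. The sketch below is an illustration under assumptions of its own: the population is taken to be Exponential(1), a skewed distribution with mean 1 and standard deviation 1, and the function names, seed, and sample sizes are arbitrary. If the standardized sample mean is close to N(0,1), about 95% of its simulated values should fall within ±1.96.

```python
import math
import random

def standardized_mean(sample, mu, sigma):
    """(Y-bar - mu_Y) / (sigma_Y / sqrt(n)) for one sample."""
    n = len(sample)
    y_bar = sum(sample) / n
    return (y_bar - mu) / (sigma / math.sqrt(n))

def share_within(z_cut, n, n_samples, seed=0):
    """Fraction of simulated standardized means with absolute value <= z_cut."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and std. dev. of the assumed Exponential(1) population
    hits = 0
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(standardized_mean(sample, mu, sigma)) <= z_cut:
            hits += 1
    return hits / n_samples

share = share_within(1.96, n=100, n_samples=4000)
print(share)  # should be near 0.95 if the normal approximation is good
```

Rerunning with a smaller n (say 5) is a simple way to see the approximation deteriorate for a skewed population.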

Summary: The Sampling Distribution of $\bar{Y}$

For $Y_1, \ldots, Y_n$ i.i.d. with $0 < \sigma_Y^2 < \infty$,

• The exact (finite sample) sampling distribution of $\bar{Y}$ has mean $\mu_Y$ ("$\bar{Y}$ is an unbiased estimator of $\mu_Y$") and variance $\sigma_Y^2/n$

• Other than its mean and variance, the exact distribution of $\bar{Y}$ is complicated and depends on the distribution of Y (the population distribution)

• When n is large, the sampling distribution simplifies:

o $\bar{Y} \xrightarrow{p} \mu_Y$ (Law of large numbers)

o $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\operatorname{var}(\bar{Y})}}$ is approximately N(0,1) (CLT)

(b) Why Use $\bar{Y}$ To Estimate $\mu_Y$?

• $\bar{Y}$ is unbiased: $E(\bar{Y}) = \mu_Y$

• $\bar{Y}$ is consistent: $\bar{Y} \xrightarrow{p} \mu_Y$

• $\bar{Y}$ is the "least squares" estimator of $\mu_Y$; $\bar{Y}$ solves,

$\min_m \sum_{i=1}^{n} (Y_i - m)^2$

so, $\bar{Y}$ minimizes the sum of squared deviations ("residuals")

optional derivation (also see App. 3.2)

$\dfrac{d}{dm}\sum_{i=1}^{n} (Y_i - m)^2 = \sum_{i=1}^{n} \dfrac{d}{dm}(Y_i - m)^2 = -2\sum_{i=1}^{n} (Y_i - m)$

Set derivative to zero and denote optimal value of m by $\hat{m}$:

$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{m} = n\hat{m}$ or $\hat{m} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i = \bar{Y}$
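The least squares property can be checked numerically. This is a minimal sketch with made-up data: it evaluates the sum of squared deviations on a grid of candidate values of m around the sample mean and confirms that the sample mean gives the smallest value on that grid.

```python
def sum_squared_deviations(y, m):
    """Objective from the minimization problem: sum over i of (Y_i - m)^2."""
    return sum((yi - m) ** 2 for yi in y)

y = [2.0, 4.0, 4.0, 5.0, 10.0]  # hypothetical observations
y_bar = sum(y) / len(y)

# Candidate values of m on a grid from y_bar - 2 to y_bar + 2 in steps of 0.1.
candidates = [y_bar + step / 10 for step in range(-20, 21)]
best = min(candidates, key=lambda m: sum_squared_deviations(y, m))

print(y_bar, best, sum_squared_deviations(y, best))
```

A grid search is used only because it is transparent; the derivation above already gives the minimizer in closed form.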

Why Use $\bar{Y}$ To Estimate $\mu_Y$, ctd.

• $\bar{Y}$ has a smaller variance than all other linear unbiased estimators: consider the estimator, $\hat{\mu}_Y = \dfrac{1}{n}\sum_{i=1}^{n} a_i Y_i$, where $\{a_i\}$ are such that $\hat{\mu}_Y$ is unbiased; then $\operatorname{var}(\bar{Y}) \le \operatorname{var}(\hat{\mu}_Y)$ (proof: SW, Ch. 17)

• $\bar{Y}$ isn't the only estimator of $\mu_Y$: can you think of a time you might want to use the median instead?
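The variance comparison can be illustrated without the full proof. For i.i.d. observations, the weighted estimator with weights $a_i$ has variance $(\sigma_Y^2/n^2)\sum a_i^2$, and it is unbiased when the weights sum to n. The sketch below relies on those two facts; the function name, the weights, and the value of the population variance are hypothetical choices for illustration.

```python
def weighted_mean_variance(weights, sigma2):
    """Variance of (1/n) * sum(a_i * Y_i) for i.i.d. Y_i with variance sigma2."""
    n = len(weights)
    if abs(sum(weights) - n) > 1e-9:
        # Unbiasedness requires the weights a_i to sum to n.
        raise ValueError("weights must sum to n for the estimator to be unbiased")
    return sigma2 * sum(a * a for a in weights) / n ** 2

sigma2 = 4.0  # hypothetical population variance
equal = [1.0, 1.0, 1.0, 1.0]    # the ordinary sample mean
unequal = [0.5, 1.5, 0.5, 1.5]  # another unbiased linear estimator

print(weighted_mean_variance(equal, sigma2))    # 1.0, i.e. sigma2 / n
print(weighted_mean_variance(unequal, sigma2))  # 1.25, larger than sigma2 / n
```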

1. The probability framework for statistical inference

2. Estimation

3. Hypothesis Testing

4. Confidence intervals

Hypothesis Testing

The hypothesis testing problem (for the mean): make a provisional

decision, based on the evidence at hand, whether a null hypothesis is

true, or instead that some alternative hypothesis is true. That is, test

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) > \mu_{Y,0}$ (1-sided, >)

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) < \mu_{Y,0}$ (1-sided, <)

$H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) \ne \mu_{Y,0}$ (2-sided)

Some terminology for testing statistical hypotheses:

p-value = probability of drawing a statistic (e.g. $\bar{Y}$) at least as adverse to the null as the value actually computed with your data, assuming that the null hypothesis is true.

Probability of type I error: the probability of making an error by rejecting the null hypothesis when it is true.

The significance level of a test is a pre-specified probability of

incorrectly rejecting the null, when the null is true.

Calculating the p-value based on $\bar{Y}$:

p-value = $\Pr_{H_0}\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right]$

where $\bar{Y}^{act}$ is the value of $\bar{Y}$ actually observed (nonrandom)

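Calculating the p-value, ctd.

• To compute the p-value, you need to know the sampling distribution of $\bar{Y}$, which is complicated if n is small.

• If n is large, you can use the normal approximation (CLT):

$$
\begin{aligned}
\text{p-value} &= \Pr_{H_0}\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right] \\
&= \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right|\,\right] \\
&= \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{\sigma_{\bar{Y}}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{\sigma_{\bar{Y}}}\right|\,\right] \\
&\approx \text{probability under left+right N(0,1) tails}
\end{aligned}
$$

where $\sigma_{\bar{Y}}$ = std. dev. of the distribution of $\bar{Y}$ = $\sigma_Y/\sqrt{n}$.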

Calculating the p-value with $\sigma_Y$ known:

• For large n, p-value = the probability that a N(0,1) random variable falls outside $\left|(\bar{Y}^{act} - \mu_{Y,0})/\sigma_{\bar{Y}}\right|$

• In practice, $\sigma_Y$ is unknown; it must be estimated
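The large-n calculation with a known population standard deviation can be sketched as follows. The function name and the input values are hypothetical (only n = 420 echoes the sample size mentioned earlier), and the two-tail normal probability is computed with the identity Pr(|Z| > z) = erfc(z/√2) for a standard normal Z.

```python
import math

def two_sided_p_value(y_bar_act, mu_0, sigma_y, n):
    """Large-n p-value: probability that |N(0,1)| exceeds the observed |z|."""
    sigma_y_bar = sigma_y / math.sqrt(n)      # std. dev. of the sample mean
    z_act = (y_bar_act - mu_0) / sigma_y_bar  # standardized observed value
    # For Z ~ N(0,1): Pr(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z_act) / math.sqrt(2))

# Hypothetical inputs, for illustration only.
p = two_sided_p_value(y_bar_act=652.0, mu_0=650.0, sigma_y=19.0, n=420)
print(p)
```

A one-sided test would use only one of the two tails, i.e. half of this value when the observed deviation is in the direction of the alternative.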

Estimator of the variance of Y, if $\sigma_Y$ is unknown:

$s_Y^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ = "sample variance of Y"

Fact:

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $E(Y^4) < \infty$, then $s_Y^2 \xrightarrow{p} \sigma_Y^2$

Why does the law of large numbers apply?

• Because $s_Y^2$ is a sample average; see Appendix 3.3

• Technical note: we assume $E(Y^4) < \infty$ because here the average is not of $Y_i$, but of its square; see App. 3.3
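The sample variance formula translates directly into code. The sketch below uses made-up observations and compares the hand-written version with `statistics.variance` from the Python standard library, which also uses the n - 1 divisor.

```python
import statistics

def sample_variance(y):
    """s_Y^2 = (1 / (n - 1)) * sum of squared deviations from the sample mean."""
    n = len(y)
    y_bar = sum(y) / n
    return sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

y = [2.0, 4.0, 4.0, 5.0, 10.0]  # hypothetical observations

print(sample_variance(y))      # 9.0
print(statistics.variance(y))  # same value from the standard library
```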


AUTHOR

Atreyu


DOCUMENT DESCRIPTION

Teaching material for the Applied Econometrics (Econometria applicata) course of Prof. Roberto Golinelli, covering the following topics: types of data (cross-sectional, time series, and panel data); elements of probability and statistics; elements of statistical theory; the statistical hypothesis; the distribution and the confidence interval.


DETAILS
Degree program: Corso di laurea in economia, mercati e istituzioni (Economics, Markets and Institutions)
University: Bologna - Unibo
Academic year: 2011-2012

The contents of this page are a personal reworking by the publisher Atreyu of information learned by attending the Applied Econometrics lectures and through independent study of any reference books, in preparation for the final exam or thesis. They should not be regarded as official material of the University of Bologna - Unibo or of Prof. Roberto Golinelli.
