Exam papers, Prof. Bontempi - Econometrics

The adjusted R squared is low in both the simple and the multiple regression model, and it decreases from the first to the second. This suggests that the demographic variables, taken together with income, add little to the explanation of Y (97% of its variation remains unexplained).
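
The fall in the adjusted R squared when regressors are added comes from the degrees-of-freedom penalty in its formula. A minimal sketch in Python follows; the R squared values, the sample size and the numbers of regressors are hypothetical and are not taken from the exam output.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R squared for a model with an intercept and k regressors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values: five extra weak regressors raise R squared only slightly,
# so the penalty for the lost degrees of freedom dominates.
simple = adjusted_r2(r2=0.035, n=200, k=1)
multiple = adjusted_r2(r2=0.045, n=200, k=6)
print(round(simple, 4), round(multiple, 4))
```

The plain R squared can never decrease when a regressor is added, which is why the comparison between the two models is made on the adjusted version.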

2. Possible answer on multicollinearity:

The correlation matrix of the explanatory variables shows a fairly high positive correlation between income and education, equal to 0.46. We can test these two variables jointly with an F test in order to check for multicollinearity. The p-value of the F test is lower than 5%, so we reject H0 (both coefficients equal to zero) in favor of H1. This result is not consistent with the individual t tests, which suggest the opposite: each variable taken alone is not significant.

So there is multicollinearity between income and education: we should drop one of the two in order to study the regression.
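
The joint test can be sketched with the R squared form of the F statistic, which compares the model that contains the two variables with the model that excludes them. This assumes homoskedastic errors and the same dependent variable in both models; all numbers below are hypothetical, not the exam output.

```python
def f_statistic(r2_unrestricted: float, r2_restricted: float,
                q: int, n: int, k: int) -> float:
    """F statistic for q exclusion restrictions.

    n is the number of observations and k the number of regressors
    (intercept excluded) in the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator


# Hypothetical values: dropping two regressors lowers R squared from 0.06 to 0.02.
f = f_statistic(r2_unrestricted=0.06, r2_restricted=0.02, q=2, n=200, k=6)
# Compare f with the 5% critical value of the F(2, 193) distribution (roughly 3.0).
print(round(f, 2))
```

A joint F statistic above the critical value together with small individual t statistics is the pattern described in the answer.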

3. Given what we have said above, the better model between 1 and 2 seems to be the simple one. Indeed, in the multiple regression model we have multicollinearity and none of the coefficients is statistically different from zero.

In general, both the simple and the multiple regression are poor models, given what we have said about the adjusted R squared.

EXAM 2

Answers.

1. We are dealing with cross-sectional data. In particular, the dataset contains information about working individuals in the USA in 1987. The command descr gives information about our variables: experience (years of full-time work experience), male (1 if male, 0 otherwise), school (years of schooling) and wage (hourly wage in 1980 dollars). The number of observations is 3294.

The command sum y, d gives more information about the dependent variable. First, it is not normally distributed: the skewness is different from zero and the kurtosis is higher than three. Since the mean of y is not much higher than the median, we run sktest to confirm this statement: the p-value is equal to zero, so we reject normality. In particular, the skewness is equal to 1.97: the distribution has a long right tail, with most values concentrated on the left and a few large positive outliers. A kurtosis higher than three (12.63) indicates tails heavier than those of the normal distribution, as in a Student's t distribution.
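
As an illustration of the two statistics discussed above, here is a minimal sketch of the moment-based skewness and kurtosis (a normal distribution gives 0 and 3). The wage values are made up for the example and are not the exam data.

```python
import statistics


def skewness_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical hourly wages: many low values and a few large positive outliers.
wages = [3, 4, 4, 5, 5, 5, 6, 6, 7, 8, 9, 12, 25, 40]
skew, kurt = skewness_kurtosis(wages)
print(round(skew, 2), round(kurt, 2))
print(statistics.mean(wages) > statistics.median(wages))
```

With a long right tail the outliers pull the mean above the median, while the bulk of the observations stays at low values.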

Moreover, the correlation command in Stata shows that all the explanatory variables are positively correlated with the dependent variable.

2. The coefficient of school is 0.56. This means that one extra year of schooling is associated with an increase in the hourly wage of 0.56 dollars. The intercept has no meaningful interpretation, since it is not credible that someone in the sample has zero years of schooling.

3. The coefficient of schooling in this case is equal to 0.1052. So, with log(wage) as the dependent variable, one extra year of schooling implies an increase in wage of approximately 10.52%.
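
Reading the coefficient as a percentage is a first-order approximation. A short sketch of the exact change implied by a coefficient in a regression of log(wage), using the 0.1052 reported above (the function name is only illustrative):

```python
import math


def exact_percent_change(beta: float) -> float:
    """Exact percentage change in y for a one-unit increase in x when log(y) is regressed on x."""
    return 100.0 * (math.exp(beta) - 1.0)


approx = 100.0 * 0.1052                # approximation used in the answer
exact = exact_percent_change(0.1052)   # slightly larger than the approximation
print(round(approx, 2), round(exact, 2))
```

The gap between the two figures grows with the size of the coefficient, so the approximation is best for small values.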

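Now we have two different dependent variables: wage and log(wage). So, if we want to compare the first and the second regression, we need to look at the ratio between the SD of Y and the root MSE in each model: the higher the ratio, the larger the share of the variability of its own dependent variable that the model explains. In the first model the ratio (SD of y / root MSE) is 1.042, while in the second it is 1.04079. The two values are almost identical, and the ratio is marginally higher in the first model, so by this criterion the second model does not explain y better than the first.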
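
The ratio used here maps one-to-one into the adjusted R squared of each model, measured on its own dependent variable, because the squared root MSE is the residual variance corrected for degrees of freedom and the squared SD is the sample variance of y. A minimal sketch using the two ratios reported above (the function name is only illustrative):

```python
def implied_adjusted_r2(sd_over_rmse: float) -> float:
    """Adjusted R squared implied by the ratio SD(y) / root MSE."""
    return 1.0 - 1.0 / sd_over_rmse ** 2


for label, ratio in [("wage", 1.042), ("log(wage)", 1.04079)]:
    print(label, round(implied_adjusted_r2(ratio), 4))
```

A ratio of exactly 1 corresponds to an adjusted R squared of zero, and larger ratios correspond to a better fit.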

4. In our new regression, all the coefficients are statistically different from zero (p-value = 0). In particular, the coefficient of the dummy variable male tells us that, ceteris paribus, men earn on average 1.344 dollars per hour more than women.

Note that there is a negative correlation between school and experience (-0.20). Since the coefficient of experience is positive, omitting experience in the simple regression model causes a downward bias in the estimator of the schooling coefficient: once experience is included, the coefficient of schooling increases from 0.56 to 0.64. Moreover, the confidence interval of schooling in the multiple regression model does not contain 0.56, which indicates that the bias is significant. Consequently, we prefer the multiple regression model.
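
The direction of the bias can be checked on simulated data. The sketch below uses made-up data, not the exam dataset: experience is generated to be negatively related to schooling and to have a positive effect on wage. It verifies the omitted-variable identity: the simple-regression slope equals the multiple-regression slope of school plus the slope of exper times the slope from regressing exper on school.

```python
import random


def cov(x, y):
    """Sample covariance (divisor n; the divisor cancels in the slopes)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def simple_slope(y, x):
    """OLS slope of y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def two_regressor_slopes(y, x1, x2):
    """OLS slopes of y on x1 and x2 with an intercept (2x2 normal equations)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated data (an assumption for illustration only).
random.seed(0)
n = 500
school = [random.uniform(8, 18) for _ in range(n)]
exper = [20 - 0.5 * s + random.gauss(0, 3) for s in school]
wage = [1 + 0.6 * s + 0.1 * e + random.gauss(0, 1) for s, e in zip(school, exper)]

b_simple = simple_slope(wage, school)
b_school, b_exper = two_regressor_slopes(wage, school, exper)
delta = simple_slope(exper, school)  # slope of exper on school, negative here

print(round(b_simple, 4), round(b_school + b_exper * delta, 4))
```

Because the slope of exper on school is negative and the coefficient of exper is positive, their product is negative, so the simple-regression slope lies below the multiple-regression one.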

When we add exper_sq, its p-value and the p-value of experience both become higher than alpha, so the two variables are not individually statistically significant. The marginal effect of experience on wage can be obtained from the derivative with respect to exper:

$\Delta wage / wage \approx (\beta_1 + 2 \beta_2 \, exper) \, \Delta exper$

where $\beta_1$ is the coefficient of exper and $\beta_2$ is the coefficient of exper_sq.
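
With a quadratic term the marginal effect of experience depends on the level of experience itself. A minimal sketch, assuming a specification that contains both exper and exper_sq; the two coefficient values are hypothetical and are not the exam estimates.

```python
def marginal_effect_of_experience(beta_exper: float, beta_exper_sq: float,
                                  exper: float) -> float:
    """Derivative of beta_exper * exper + beta_exper_sq * exper**2 with respect to exper."""
    return beta_exper + 2.0 * beta_exper_sq * exper


# Hypothetical coefficients: positive linear term, negative quadratic term.
b1, b2 = 0.04, -0.0006
for years in (0, 10, 20, 30):
    print(years, round(marginal_effect_of_experience(b1, b2, years), 4))

# Level of experience at which the marginal effect becomes zero.
turning_point = -b1 / (2.0 * b2)
print(round(turning_point, 1))
```

With a negative quadratic coefficient the effect of one more year of experience shrinks as experience accumulates and changes sign at the turning point.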

EXAM 3

Answers.

1. The data set is cross-sectional: we have 1990 data about countries, their life expectancy and their GDP. The command descr in Stata gives the number of observations, which is 124, as well as a description of our variables:

• Country = Indicator of country

• cid= Country label

• lexp= Life expectancy in years

• gdp= Per capita GDP at parity of purchasing power

Through the scatterplot we can see that between the life expectancy and the gdp there is a

positive relationship:

[Figure: scatterplot of lexp against gdp with fitted values; vertical axis from 40 to 90, horizontal axis (gdp) from 0 to 20000]

This means that a higher gdp is associated with a higher life expectancy. The graph also shows evidence of heteroskedasticity: the variance of the errors differs across levels of x. For small values of the explanatory variable the variance appears larger, while for large values it appears smaller: at low levels of gdp, y is more volatile.
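
The visual impression can be backed by an informal check: fit the line, then compare the residual variance in the low-gdp half of the sample with that in the high-gdp half. The sketch below uses simulated data whose noise shrinks as gdp grows; both the data and the numbers are assumptions for illustration, and this is not a formal test.

```python
import random


def ols_line(x, y):
    """Intercept and slope of the OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Simulated data: 124 observations, noise standard deviation falling with gdp.
random.seed(1)
gdp = sorted(random.uniform(500, 20000) for _ in range(124))
lexp = [55 + 0.001 * g + random.gauss(0, 8 - 0.0003 * g) for g in gdp]

intercept, slope = ols_line(gdp, lexp)
residuals = [y - intercept - slope * g for g, y in zip(gdp, lexp)]

half = len(gdp) // 2  # gdp is sorted, so the first half holds the low-gdp observations
var_low = sum(r * r for r in residuals[:half]) / half
var_high = sum(r * r for r in residuals[half:]) / (len(gdp) - half)
print(round(var_low / var_high, 2))
```

A ratio well above 1 points to a larger error variance at low gdp; a formal conclusion would need a proper test and heteroskedasticity-robust standard errors.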

The command sum lexp, d gives some information about the distribution of our dependent variable. In particular, the skewness is different from zero (but close to it: 0.2) and the kurtosis is different from three (1). Consequently, the skewness tells us that the distribution is fairly symmetric, and the kurtosis that it has very thin tails. We can run a test in order to be more confident about these statements. The command is sktest. The p-value is equal to zero, so at the 5% significance level we reject H0 in favor of H1: our distribution is not a normal distribution.

Details
Academic year 2016-2017
SSD (scientific-disciplinary sector): Economics and statistics, SECS-P/05 Econometrics

The contents of this page are a personal reworking by the Publisher .Giulia11 of information learned by attending the Econometrics lectures and through independent study of any reference books, in preparation for the final exam or the thesis. They are not to be understood as official material of the Università degli Studi di Bologna or of Prof. Bontempi Maria Elena.