CAPM
Equilibrium assumptions
- Open market
- The risky assets are all the tradeable stocks, available to everyone
- There is a risk-free asset (for borrowing and/or lending in unlimited quantities) with
interest rate r_f
- All information (covariances, variances, mean rates of return of stocks, and so on) is
available to everyone
- Everyone is a risk-averse rational investor who uses the same financial engineering:
mean-variance portfolio theory from Markowitz.
→ Everyone has a portfolio on the same efficient frontier, and hence has a portfolio that is a
mixture of the risk-free asset and a unique efficient fund F (of risky assets). In other words,
everyone sets up the same optimization problem, does the same calculation, gets the same
answer and chooses a portfolio accordingly.
→This efficient fund used by all is called the market portfolio and is denoted by M. The fact
that it is the same for all leads us to conclude that it should be computable without using all
the optimization methods from Markowitz: The market has already reached an equilibrium so
that the weight for any asset in the market portfolio is given by its capital value (total worth
of its shares) divided by the total capital value of the whole market (all assets together).
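The equilibrium weights described above can be sketched directly. This is a minimal illustration, not real market data: the asset names and capital values below are invented.

```python
# Market-portfolio weights as described above: each asset's weight is its
# capital value (total worth of its shares) divided by the total capital
# value of the whole market. The asset names and values are hypothetical.

def market_weights(cap_values):
    """Map asset -> capital value into asset -> market-portfolio weight."""
    total = sum(cap_values.values())
    return {asset: value / total for asset, value in cap_values.items()}

caps = {"A": 300.0, "B": 500.0, "C": 200.0}  # hypothetical capital values
weights = market_weights(caps)
print(weights)  # {'A': 0.3, 'B': 0.5, 'C': 0.2}
```

No optimization is needed: the weights come straight from observed capital values, which is the point of the equilibrium argument.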
CAPM in gretl
Time series data: google_for_CAPM.csv
The most important parameter is β. This parameter measures the systematic risk of the portfolio.
Using the Google and Nasdaq data, we will run a simple regression with the returns of one specific
stock as the dependent variable: here, Google's returns regressed on a constant and the market
(Nasdaq) returns.
We will estimate this with OLS.
Gretl: File > Open data > google_for_CAPM.csv
Model > Ordinary Least Squares > Dependent variable: Google return; Independent: Nasdaq
return
We have 52 closing prices but only 51 observations, because each return needs two consecutive prices.
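The price-to-return step explains the lost observation. A minimal sketch, with invented closing prices (the real data would be the Google and Nasdaq price columns):

```python
# Simple (percentage) returns from a series of closing prices: each return
# uses two consecutive prices, so n prices give n - 1 returns.

def simple_returns(prices):
    return [prices[i + 1] / prices[i] - 1 for i in range(len(prices) - 1)]

prices = [100.0, 102.0, 99.96, 101.0]  # hypothetical closing prices
rets = simple_returns(prices)
print(len(prices), len(rets))  # 4 3
```

With 52 prices this yields exactly the 51 observations seen in the Gretl output.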
Here we have β = 1.06419; you will find this value in the coefficient column. This is the beta of
Google's return, and it implies that Google's stock is slightly riskier than the overall market (β > 1).
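The regression Gretl runs can be written in closed form: with a single regressor, the OLS slope is the sample covariance of the two return series divided by the variance of the regressor. A sketch with invented return series (the real inputs would be the Nasdaq and Google return columns):

```python
# Simple OLS in closed form: beta_hat = cov(x, y) / var(x) and
# alpha_hat = mean(y) - beta_hat * mean(x). The series below are
# invented so the true alpha and beta are known.

def ols_simple(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    beta = cov / var
    alpha = my - beta * mx
    return alpha, beta

market = [0.01, -0.02, 0.03, 0.00, 0.015]   # hypothetical market returns
stock = [0.001 + 1.5 * r for r in market]   # built so alpha=0.001, beta=1.5
alpha, beta = ols_simple(market, stock)
print(round(alpha, 6), round(beta, 6))  # 0.001 1.5
```

Because the fake stock series is exactly linear in the market series, the estimates recover the true parameters; on real data there would be a residual term.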
Analysis > confidence intervals for coefficients
A 95% confidence interval means that, over repeated samples, 95% of the intervals constructed this way would contain the true parameter.
The confidence interval for beta goes from 0.69 to 1.43. This means the estimate of beta is not
precise: 0.69 would mean the asset's systematic risk is much smaller than the market's, while 1.43
would mean it is much higher. At the upper extreme of the interval, the asset would also add
considerably to the variance of the model and increase the overall riskiness relative to the market.
The standard error is the square root of the variance of the estimate. We have a low standard error
for the constant and a large one for beta.
Test of significance of the coefficient
Can we reject the null hypothesis that beta=0?
If you reject the H0 that beta = 0, you accept the estimated coefficient: the variable has some
effect and helps explain the dependent variable. Otherwise, you can eliminate that variable from
your model.
The t-ratio is the test statistic we need, calculated as (β̂ − β_H0) / SE(β̂); it is a discrepancy
measure. If the t-ratio is greater than the threshold (critical) value, we reject the null
hypothesis; otherwise, we fail to reject it.
We don't see a threshold value here, but we have the p-value: the probability, under the null
hypothesis, of getting a statistic larger (in absolute value) than the observed t-ratio. We then
compare this to our significance level α: if the p-value is greater than α, we do not reject the
null hypothesis; otherwise, we reject it.
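The decision rule above can be sketched numerically. This uses the standard normal approximation to the t distribution (reasonable with around 50 observations); the t-ratio and α values are just examples.

```python
import math

# Two-sided p-value for a t-ratio, using the normal approximation to the
# t distribution. Decision rule: reject the null when p-value < alpha.

def two_sided_p(t_ratio):
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) is the standard normal CDF
    phi = 0.5 * (1 + math.erf(abs(t_ratio) / math.sqrt(2)))
    return 2 * (1 - phi)

alpha = 0.05
t = 1.96  # textbook borderline value for a 5% two-sided test
p = two_sided_p(t)
print(round(p, 3), p < alpha)  # 0.05 True
```

A t-ratio near zero gives a large p-value (no evidence against the null), while a large t-ratio gives a tiny p-value, matching the rule stated above.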
We can see that the Nasdaq return's p-value is far smaller than any usual significance level, so we
reject the null hypothesis and accept Nasdaq return as a significant variable whose coefficient is
not 0. Of course, in real life this number won't always be 1.06, since the two extremes of the
confidence interval are very different, as explained above.
We have the mean of the dependent variable, which represents the average return of Google's stock.
The sum of squared residuals represents the differences between the real returns of Google's stock
and their fitted values. OLS minimizes this quantity, and as we see here it is quite small.
An R-squared of 0.41 means that 41% of the variability of the real values is captured by the model,
leaving 59% unexplained.
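R-squared is one minus the ratio of the sum of squared residuals to the total sum of squares. A sketch with invented actual/fitted values:

```python
# R-squared = 1 - SSR / SST, where SSR is the sum of squared residuals
# and SST the total sum of squares around the mean. Values are invented.

def r_squared(actual, fitted):
    mean_a = sum(actual) / len(actual)
    ssr = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ssr / sst

actual = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(actual, fitted), 3))  # 0.98
```

A perfect fit gives R² = 1; a model no better than the mean gives R² = 0.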
The F-test is the test of joint significance of the coefficients, excluding the constant. Here we
only have one independent variable, so the result is equivalent to that of the t-test: we reject
the null hypothesis.
The standard error of the regression gives you an idea of the standard deviation of the disturbances.
If the disturbances do not all have the same variance, our estimates are no longer precise: this is
heteroskedasticity. To test for this problem we can use White's test.
Tests > Heteroskedasticity > White's test
We can see that the dependent variable is the squared residuals, so the question here is whether the
variance of the shocks depends on any of the listed independent variables. If the coefficients can
jointly be 0 (our null hypothesis), then the variance of the shocks does not depend on these
variables and there is no heteroskedasticity.
We can see that the p-value is high, so we do not reject the null hypothesis.
If we had heteroskedasticity, we would re-estimate the final model with robust standard errors.
If we are interested in the normality of the residuals, we can run Tests > Normality of residual.
This test compares the observed frequencies of our residuals with the frequencies we would expect
if they were normally distributed.
The p-value is inside the brackets (0.10) and is greater than our significance level, so we do not
reject the null hypothesis: the residuals are normal.
So our model is:
Google return = 0.00097 + 1.064 × Nasdaq return
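The fitted equation can be used directly for prediction. The 1% market-return scenario below is just an example; the coefficients are the ones estimated above.

```python
# Predicting Google's return from the estimated CAPM regression:
# Google return = 0.00097 + 1.064 * Nasdaq return.

ALPHA, BETA = 0.00097, 1.064  # estimates from the Gretl output above

def predict_google(nasdaq_return):
    return ALPHA + BETA * nasdaq_return

print(round(predict_google(0.01), 5))  # 0.01161
```

Because β > 1, a 1% market move translates into a slightly larger predicted move in Google's return.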
Cross section data
Wooldridge data on the determinants of wage.
We can see there are dummy variables, which take only two values, represented by 0 and 1. For
example, nonwhite = 1 if the person is not white, and 0 if the person is white.
These data are undated, so we have cross-section data.
Question: Is wage affected by the parents’ education?
Model
lwage = β0 + β1·meduc + β2·feduc + u
Our dependent variable is the log of wage.
Our first independent variable is the mother's education (meduc) and the second is the father's
education (feduc).
Sub-text:
Are the coefficients significantly different from 0? Looking at the p-values, the constant has a
p-value of 0 and a t-ratio of 103, which lies far to the right of the threshold level, so we reject
the null hypothesis.
Doing the same with other coefficients, we can see that they are all significantly different from 0
With this, we can read the final model as
lwage = 6.39 + 0.02·meduc + 0.02·feduc
Another way to read this: three stars mean the estimate is significant at the 1% level (extremely
significant), two stars mean significance at the 5% level, and one star at the 10% level.
How does this model perform overall?
The F-test has a value of 24 and a small p-value (6.42e-11), so we reject the null hypothesis and
the model holds.
R-squared is really small: only 6% of the variability of lwage is captured by this model, so there
is a huge amount of unexplained variability. If we had wage rather than lwage as the dependent
variable, what would the coefficients stand for? They would be the derivatives of wage with respect
to meduc and feduc, while the constant would be the wage when both are 0.
The coefficients tell us how much the dependent variable increases/decreases (in units) when the
independent variable increases/decreases by 1 unit.
If we had a different type of regression, with log of wage and log of feduc, the coefficient would
represent the elasticity: the percentage change in the dependent variable for a 1% change in the
independent variable.
The variables are all strongly significant
R squared is 0.25, which is not bad given that the cross section data has a lot of variability
For the F-test, the statistic is 44 and the p-value is 1.16×10⁻⁵⁴, so we reject the null hypothesis.
Wage is increasing in experience, since the coefficient is 0.014.
Model
lwage = 5.39550 + 0.014·exper + 0.012·tenure + 0.2·married
(married = 1 if married, 0 if not; the three stars under the coefficient mean it is highly
significant: you will get a higher wage if you're married)
A dummy variable doesn't have an intrinsic numerical value, just an assigned one. If you increase
your experience by 1 year, your wage increases by about 1.4%; if you are married, your wage is
higher by 0.2 log points (roughly 20%).
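Since the dependent variable is in logs, a coefficient b is only approximately the proportional change in wage; the exact effect is exp(b) − 1. A sketch with the two coefficients from the model above:

```python
import math

# Exact percentage effect of a coefficient in a log-level model:
# exp(b) - 1. The approximation "b ~= proportional change" is very good
# for small b (exper) and rougher for larger b (the married dummy).

def exact_effect(b):
    return math.exp(b) - 1

for name, b in [("exper", 0.014), ("married", 0.2)]:
    print(name, b, round(exact_effect(b), 4))
```

So the marriage premium is closer to 22% than to 20%, while for experience the approximation is essentially exact.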
If we don't have a variable proxying for skill, it may be that the wage is high not because the
education level is high but because the person is more skilled. So if you want to know the return
of education on your future wage, this is a bad way to do it.
Let's use IQ as a proxy for skill, modify the model, and see what happens.