R - Plots & Proofs (Simulating Consistency)
Point Estimation of the Mean (Normal and not Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Risk of the estimator
- Asymptotic normality
Point Estimation of the Variance (Known Mean and Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Distribution of the estimator
- Risk of the estimator
Point Estimation of the Variance (Unknown Mean and Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Distribution of the estimator
- Risk of the estimator
Point estimation is the act of choosing, based on the sample ξ, a parameter θ̂ ∈ Θ that is our best guess of the true (and unknown) parameter θ0. Our best guess θ̂ is called an estimate of θ0.
Evaluation of an Estimator (Loss Function)
Making an estimate θ̂ is an act that produces some consequences. Among the consequences that are usually considered in a parametric decision problem, the most relevant one is the estimation error. The estimation error e is the difference between the estimate θ̂ and the true parameter θ0:
e = θ̂ − θ0
Of course, the statistician's goal is to commit the smallest possible estimation error. This preference can be formalized using loss functions. A loss function L(θ̂, θ0), mapping Θ × Θ into R, quantifies the loss incurred by estimating θ0 with θ̂. Frequently used loss functions are:
• The absolute error: L(θ̂, θ0) = ||θ̂ − θ0||
• The squared error: L(θ̂, θ0) = ||θ̂ − θ0||²
The expected value of a loss function is called the statistical risk of the estimator and is denoted by: R(θ̂) = E[L(θ̂, θ0)]
• When the absolute error is used as a loss function, then the risk R(θ̂) = E[|θ̂ − θ0|] is called the mean absolute error (MAE).
• When the squared error is used as a loss function, then the risk R(θ̂) = E[(θ̂ − θ0)²] is called the mean squared error (MSE). The square root of the mean squared error is called the root mean squared error (RMSE).
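A minimal R sketch of these two risk measures, approximating the MAE and MSE of the sample mean by simulation (the choices M = 100000, n = 50 and mu0 = 5 are arbitrary and only serve this example):

set.seed(1)
M <- 100000; n <- 50; mu0 <- 5
est <- replicate(M, mean(rnorm(n, mean = mu0)))   # many estimates of mu0
mae <- mean(abs(est - mu0))    # Monte Carlo approximation of E[|theta.hat - theta0|]
mse <- mean((est - mu0)^2)     # Monte Carlo approximation of E[(theta.hat - theta0)^2]
c(MAE = mae, MSE = mse, RMSE = sqrt(mse))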
1. Unbiasedness
If an estimator produces parameter estimates that are on average correct, then it is said to be unbiased.
Let θ0 be the true parameter and let θ̂ be an estimator. θ̂ is an unbiased estimator of θ0 if and only if:
E[θ̂ - θ0] = 0
Also note that if an estimator is unbiased, then the estimation error is on average zero:
E[e] = E[θ̂ − θ0] = E[θ̂] − θ0 = θ0 − θ0 = 0
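A minimal R sketch of unbiasedness: the average estimation error of the sample mean over many simulated normal samples is approximately zero (the choices M = 100000, n = 30 and mu0 = 2 are arbitrary):

set.seed(1)
M <- 100000; n <- 30; mu0 <- 2
err <- replicate(M, mean(rnorm(n, mean = mu0)) - mu0)   # e = theta.hat - theta0
mean(err)   # close to 0, up to Monte Carlo error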
2. Efficiency
Efficiency is a measure of the quality of an estimator: an estimator is said to be efficient if it has a small variance or mean squared error, indicating a small deviation between the estimated value and the true value. If θ̂1 and θ̂2 are two unbiased estimators of the same parameter θ0, their variances can be compared to determine which performs better: θ̂1 is more efficient than θ̂2 if Var(θ̂1) < Var(θ̂2) for all values of θ0.
R - Plots & Proofs (Simulating Efficiency)
set.seed(1)
M <- 100000; n <- 100
mat.y <- matrix(NA, nrow = M, ncol = 2)
for (i in 1:M) {
  y <- rnorm(n, 5)                      # normal sample with mean 5
  mat.y[i, ] <- c(mean(y), median(y))   # store sample mean and sample median
}
plot(density(mat.y[, 1]), type = "l", main = "")
lines(density(mat.y[, 2]), col = 2)
[Figure: kernel density estimates of the simulated sampling distributions of the sample mean (black) and the sample median (red); N = 100000, bandwidth = 0.008989.]
For a normal random sample, both the sample mean and the sample median are consistent estimators of µ; the mean is more efficient.
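As a numeric counterpart to the plot, assuming the matrix mat.y produced by the simulation above, the simulated variances of the two estimators can be compared directly:

apply(mat.y, 2, var)   # column 1: sample mean, column 2: sample median

For a normal sample, the variance of the sample median is roughly π/2 ≈ 1.57 times the variance of the sample mean, which is why the sample mean is the more efficient estimator.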
3. Consistency
A consistent estimator is an estimator having the property that, as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0. This means that the distribution of the estimates becomes more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to θ0 converges to one. A sequence of estimators (θ̂n) is said to be consistent if and only if: θ̂n → θ0 as n → ∞, where → indicates convergence in probability. The sequence of estimators is said to be strongly consistent if and only if: θ̂n → θ0 as n → ∞ almost surely, where → indicates almost sure convergence.
R - Plots & Proofs (Simulating Consistency)
require(MASS)
M <- 100000; n1 <- 20; n2 <- 200
y1 <- y2 <- rep(NA, M)
for (i in 1:M) {
  y1[i] <- mean(rpois(n1, 1))   # sample mean of a small Poisson(1) sample
  y2[i] <- mean(rpois(n2, 1))   # sample mean of a larger Poisson(1) sample
}
par(mfrow = c(1, 2))
hist.scott(y1, xlim = c(0, 2), main = "", xlab = ""); abline(v = 1, col = 2)
hist.scott(y2, xlim = c(0, 2), main = "", xlab = ""); abline(v = 1, col = 2)
[Figure: histograms of the simulated sample means for n = 20 (left) and n = 200 (right) on the range 0–2, with the true mean 1 marked in red; the distribution concentrates around 1 as n increases.]
Point Estimation of the Mean (Normal and not Normal IID Samples)
The sample ξn is made of n independent draws from a probability distribution having unknown mean µ and known variance σ². Specifically, we observe n realizations {x1, . . . , xn} of n independent random variables {X1, . . . , Xn}, all having the same distribution with unknown mean µ and known variance σ². The sample is the n-dimensional vector ξn = [x1, . . . , xn].

1. The estimator
As an estimator of the mean µ we use the sample mean X̂n:
X̂n = (1/n) Σ_{i=1}^n Xi

2. Expected value of the estimator
The expected value of the estimator X̂n is equal to the true mean µ:
E[X̂n] = µ
This can be proved using the linearity of the expected value:
E[X̂n] = E[(1/n) Σ_{i=1}^n Xi] = (1/n) Σ_{i=1}^n E[Xi] = (1/n) Σ_{i=1}^n µ = (1/n) nµ = µ
Therefore, the estimator X̂n of µ is unbiased.

3. Variance of the estimator
The variance of the estimator X̂n is:
Var[X̂n] = σ²/n
This can be proved using the formula for the variance of a sum of independent variables:
Var[X̂n] = Var[(1/n) Σ_{i=1}^n Xi] = (1/n²) Var[Σ_{i=1}^n Xi] = (1/n²) Σ_{i=1}^n Var[Xi] = (1/n²) nσ² = σ²/n
Therefore, the variance of the estimator tends to zero as the sample size n tends to infinity.
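A minimal R check of these two results, simulating normal samples and comparing the mean and variance of the simulated sample means with µ and σ²/n (the choices µ = 3, σ = 2, n = 25 and M = 100000 are arbitrary):

set.seed(1)
M <- 100000; n <- 25; mu <- 3; sigma <- 2
xbar <- replicate(M, mean(rnorm(n, mean = mu, sd = sigma)))
mean(xbar)   # close to mu = 3
var(xbar)    # close to sigma^2 / n = 4 / 25 = 0.16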
4. Risk of the estimator
The mean squared error of the estimator is:
MSE(X̂n) = Var[X̂n] = σ²/n
This is proved as follows:
MSE(X̂n) = E[||X̂n − µ||²] = E[|X̂n − µ|²] = E[(X̂n − µ)²] = Var[X̂n] = σ²/n
where the second-to-last equality holds because X̂n is unbiased, so E[X̂n] = µ.

5. Asymptotic normality
The sequence {Xn} satisfies the conditions of the Lindeberg-Lévy Central Limit Theorem (i.e. {Xn} is an IID sequence with finite mean and variance). Therefore, the sample mean X̂n satisfies
√n (X̂n − µ) / σ →d Z
where Z is a standard normal random variable and →d denotes convergence in distribution. In other words, for large n the sample mean X̂n is approximately normally distributed with mean µ and variance σ²/n.
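A minimal R sketch of asymptotic normality for a non-normal parent distribution, using exponential samples with rate 1 (so µ = σ = 1; the distribution and the choices n = 100, M = 100000 are arbitrary for this example):

set.seed(1)
M <- 100000; n <- 100
z <- replicate(M, sqrt(n) * (mean(rexp(n, rate = 1)) - 1) / 1)   # standardized sample means
hist(z, breaks = 60, freq = FALSE, main = "", xlab = "")
curve(dnorm(x), add = TRUE, col = 2)   # standard normal density for comparison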
Point Estimation of the Variance (Known Mean and Normal IID Samples)
The sample ξn is made of n independent draws from a normal distribution having known mean µ and unknown variance σ². Specifically, we observe n realizations {x1, . . . , xn} of n independent random variables {X1, . . . , Xn}, all having a normal distribution with known mean µ and unknown variance σ². The sample is the n-dimensional vector ξn = [x1, . . . , xn].

1. The estimator
We use the following estimator of the variance:
σ̂²n = (1/n) Σ_{i=1}^n (Xi − µ)²

2. Expected value of the estimator
The expected value of the estimator σ̂²n is equal to the true variance σ²:
E[σ̂²n] = σ²
This can be proved using the linearity of the expected value: