R - Plots & Proofs (Simulating Consistency)
Point Estimation of the Mean (Normal and not Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Risk of the estimator
- Asymptotic normality
Point Estimation of the Variance (Known Mean and Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Distribution of the estimator
- Risk of the estimator
Point Estimation of the Variance (Unknown Mean and Normal IID Samples)
- The estimator
- Expected value of the estimator
- Variance of the estimator
- Distribution of the estimator
- Risk of the estimator
Point estimation is the act of choosing, based on the sample ξ, a parameter θ̂ ∈ Θ that is our best guess of the true (and unknown) parameter θ0. Our best guess θ̂ is called an estimate of θ0.
Evaluation of an Estimator (Loss Function)
Making an estimate θ̂ is an act that produces some consequences. Among the consequences that are usually considered in a parametric decision problem, the most relevant one is the estimation error. The estimation error e is the difference between the estimate θ̂ and the true parameter θ0:
e = θ̂ − θ0
Of course, the statistician's goal is to commit the smallest possible estimation error. This preference can be formalized using loss functions. A loss function L(θ̂, θ0), mapping Θ × Θ into R, quantifies the loss incurred by estimating θ0 with θ̂. Frequently used loss functions are:
• The absolute error: L(θ̂, θ0) = ||θ̂ − θ0||
• The squared error: L(θ̂, θ0) = ||θ̂ − θ0||²
The expected value of a loss function is called the statistical risk of the estimator and is denoted by: R(θ̂) = E[L(θ̂, θ0)]
• When the absolute error is used as a loss function, then the risk R(θ̂) = E[|θ̂ − θ0|] is called the mean absolute error (MAE).
• When the squared error is used as a loss function, then the risk R(θ̂) = E[(θ̂ − θ0)²] is called the mean squared error (MSE). The square root of the mean squared error is called the root mean squared error (RMSE).
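A minimal R sketch of these two risk measures, approximating the MAE and MSE of the sample mean by simulation (the choices M = 100000, n = 50 and mu0 = 5 are arbitrary and only serve this example):

set.seed(1)
M <- 100000; n <- 50; mu0 <- 5
est <- replicate(M, mean(rnorm(n, mean = mu0)))   # many estimates of mu0
mae <- mean(abs(est - mu0))    # Monte Carlo approximation of E[|theta.hat - theta0|]
mse <- mean((est - mu0)^2)     # Monte Carlo approximation of E[(theta.hat - theta0)^2]
c(MAE = mae, MSE = mse, RMSE = sqrt(mse))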
1. Unbiasedness
If an estimator produces parameter estimates that are on average correct, then it is said to be unbiased.
Let θ0 be the true parameter and let θ̂ be an estimator. θ̂ is an unbiased estimator of θ0 if and only if:
E[θ̂ - θ0] = 0
Also note that if an estimator is unbiased, then the estimation error is on average zero:
E[e] = E[θ̂ − θ0] = E[θ̂] − θ0 = θ0 − θ0 = 0
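A minimal R sketch of unbiasedness: the average estimation error of the sample mean over many simulated normal samples is approximately zero (the choices M = 100000, n = 30 and mu0 = 2 are arbitrary):

set.seed(1)
M <- 100000; n <- 30; mu0 <- 2
err <- replicate(M, mean(rnorm(n, mean = mu0)) - mu0)   # e = theta.hat - theta0
mean(err)   # close to 0, up to Monte Carlo error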
2. Efficiency
Efficiency is a measure of the quality of an estimator: an estimator is said to be efficient if it has a small variance or mean squared error, indicating a small deviation between the estimated value and the true value. If θ̂1 and θ̂2 are two unbiased estimators of the same parameter θ0, their variances can be compared to determine which performs better: θ̂1 is more efficient than θ̂2 if Var(θ̂1) < Var(θ̂2) for all values of θ0.
R - Plots & Proofs (Simulating Efficiency)
set.seed(1)
M <- 100000; n <- 100
mat.y <- matrix(NA, nrow = M, ncol = 2)
for (i in 1:M) {
  y <- rnorm(n, 5)                      # normal sample with mean 5
  mat.y[i, ] <- c(mean(y), median(y))   # store sample mean and sample median
}
plot(density(mat.y[, 1]), type = "l", main = "")
lines(density(mat.y[, 2]), col = 2)
[Figure: kernel density estimates of the simulated sampling distributions of the sample mean (black) and the sample median (red); N = 100000, bandwidth = 0.008989.]
For a normal random sample, both the sample mean and the sample median are consistent estimators of µ; the mean is more efficient.
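As a numeric counterpart to the plot, assuming the matrix mat.y produced by the simulation above, the simulated variances of the two estimators can be compared directly:

apply(mat.y, 2, var)   # column 1: sample mean, column 2: sample median

For a normal sample, the variance of the sample median is roughly π/2 ≈ 1.57 times the variance of the sample mean, which is why the sample mean is the more efficient estimator.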
3. Consistency
A consistent estimator is an estimator having the property that, as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0. This means that the distribution of the estimates becomes more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to θ0 converges to one. A sequence of estimators (θ̂n) is said to be consistent if and only if: θ̂n → θ0 as n → ∞, where → indicates convergence in probability. The sequence of estimators is said to be strongly consistent if and only if: θ̂n → θ0 as n → ∞ almost surely, where → indicates almost sure convergence.
R - Plots & Proofs (Simulating Consistency)
require(MASS)
M <- 100000; n1 <- 20; n2 <- 200
y1 <- y2 <- rep(NA, M)
for (i in 1:M) {
  y1[i] <- mean(rpois(n1, 1))   # sample mean of a small Poisson(1) sample
  y2[i] <- mean(rpois(n2, 1))   # sample mean of a larger Poisson(1) sample
}
par(mfrow = c(1, 2))
hist.scott(y1, xlim = c(0, 2), main = "", xlab = ""); abline(v = 1, col = 2)
hist.scott(y2, xlim = c(0, 2), main = "", xlab = ""); abline(v = 1, col = 2)
[Figure: histograms of the simulated sample means for n = 20 (left) and n = 200 (right) on the range 0–2, with the true mean 1 marked in red; the distribution concentrates around 1 as n increases.]
Point Estimation of the Mean (Normal and not Normal IID Samples)
The sample ξn is made of n independent draws from a probability distribution having unknown mean µ and known variance σ². Specifically, we observe n realizations {x1, . . . , xn} of n independent random variables {X1, . . . , Xn}, all having the same distribution with unknown mean µ and known variance σ². The sample is the n-dimensional vector ξn = [x1, . . . , xn].

1. The estimator
As an estimator of the mean µ we use the sample mean X̂n:
X̂n = (1/n) Σ_{i=1}^n Xi

2. Expected value of the estimator
The expected value of the estimator X̂n is equal to the true mean µ:
E[X̂n] = µ
This can be proved using the linearity of the expected value:
E[X̂n] = E[(1/n) Σ_{i=1}^n Xi] = (1/n) Σ_{i=1}^n E[Xi] = (1/n) Σ_{i=1}^n µ = (1/n) nµ = µ
Therefore, the estimator X̂n of µ is unbiased.

3. Variance of the estimator
The variance of the estimator X̂n is:
Var[X̂n] = σ²/n
This can be proved using the formula for the variance of a sum of independent variables:
Var[X̂n] = Var[(1/n) Σ_{i=1}^n Xi] = (1/n²) Var[Σ_{i=1}^n Xi] = (1/n²) Σ_{i=1}^n Var[Xi] = (1/n²) nσ² = σ²/n
Therefore, the variance of the estimator tends to zero as the sample size n tends to infinity.
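A minimal R check of these two results, simulating normal samples and comparing the mean and variance of the simulated sample means with µ and σ²/n (the choices µ = 3, σ = 2, n = 25 and M = 100000 are arbitrary):

set.seed(1)
M <- 100000; n <- 25; mu <- 3; sigma <- 2
xbar <- replicate(M, mean(rnorm(n, mean = mu, sd = sigma)))
mean(xbar)   # close to mu = 3
var(xbar)    # close to sigma^2 / n = 4 / 25 = 0.16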
4. Risk of the estimator
The mean squared error of the estimator is:
MSE(X̂n) = Var[X̂n] = σ²/n
This is proved as follows:
MSE(X̂n) = E[||X̂n − µ||²] = E[|X̂n − µ|²] = E[(X̂n − µ)²] = Var[X̂n] = σ²/n
where the second-to-last equality holds because X̂n is unbiased, so E[X̂n] = µ.

5. Asymptotic normality
The sequence {Xn} satisfies the conditions of the Lindeberg-Lévy Central Limit Theorem (i.e. {Xn} is an IID sequence with finite mean and variance). Therefore, the sample mean X̂n satisfies
√n (X̂n − µ) / σ →d Z
where Z is a standard normal random variable and →d denotes convergence in distribution. In other words, for large n the sample mean X̂n is approximately normally distributed with mean µ and variance σ²/n.
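A minimal R sketch of asymptotic normality for a non-normal parent distribution, using exponential samples with rate 1 (so µ = σ = 1; the distribution and the choices n = 100, M = 100000 are arbitrary for this example):

set.seed(1)
M <- 100000; n <- 100
z <- replicate(M, sqrt(n) * (mean(rexp(n, rate = 1)) - 1) / 1)   # standardized sample means
hist(z, breaks = 60, freq = FALSE, main = "", xlab = "")
curve(dnorm(x), add = TRUE, col = 2)   # standard normal density for comparison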
Point Estimation of the Variance (Known Mean and Normal IID Samples)
The sample ξn is made of n independent draws from a normal distribution having known mean µ and unknown variance σ². Specifically, we observe n realizations {x1, . . . , xn} of n independent random variables {X1, . . . , Xn}, all having a normal distribution with known mean µ and unknown variance σ². The sample is the n-dimensional vector ξn = [x1, . . . , xn].

1. The estimator
We use the following estimator of the variance:
σ̂²n = (1/n) Σ_{i=1}^n (Xi − µ)²

2. Expected value of the estimator
The expected value of the estimator σ̂²n is equal to the true variance σ²:
E[σ̂²n] = σ²
This can be proved using the linearity of the expected value: