
DESCRIPTIVE STATISTICS

DEFINITIONS

- Population : Total collection of all the elements that we are interested in

- Statistical units : Single elements of the population

- Variable : Features or characteristics of statistical units

- Sample : Subgroup of the population that we are able to study in detail

VARIABLE TYPES

Quantitative (numerical)
  → continuous (weight, height, age)
  → discrete (number of ...)

Qualitative (categorical)
  → ordinal (disagree, neutral, agree)
  → nominal (alive or dead)

            quantitative    ordinal    nominal
mean        yes             no         no
median      yes             yes        no
mode        yes             yes        yes

graph : quantitative → line graph, bar plot, frequency polygon
        qualitative → bar chart (abs. or rel. frequency) or pie chart (rel. frequency)

CLASSES

- Pairs of values that have some relationship to each other → (x, y)

  → x qualitative and y quantitative → distinct histograms, one for each category of x

  → x and y quantitative → scatter plot (↗ positive | ↘ negative correlation)

- GROUPED DATA : class intervals use [ ; ), left endpoint included and right endpoint excluded → "having $n_1$ to $n_2$ means that $n_2 - n_1$ is the width of the interval"

  Take the mid-value of each class in order to compute the sample mean [mid-value = average of the extremes = $(n_1 + n_2)/2$]
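As an illustration of the mid-value rule above, here is a minimal Python sketch. The class intervals and frequencies are made-up example data, and the variable names are assumptions chosen for this sketch only.

```python
# Hypothetical grouped data: class intervals [lower, upper) and their absolute frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 5, 3]

n = sum(freqs)  # total number of observations

# Mid-value of each class = average of its two extremes.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Approximate sample mean: each mid-value weighted by the frequency of its class.
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

print(midpoints)     # [5.0, 15.0, 25.0]
print(grouped_mean)  # 16.0
```

The result is only an approximation of the true sample mean, since every observation in a class is treated as if it sat at the class mid-value.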

→ FREQUENCY

- Frequency : $f = n \cdot w$

- Relative frequency : $w = \frac{f}{n}$

SYMMETRY

- x is symmetric about $x_0$ : the frequencies of $x_0 - c$ and $x_0 + c$ are equal for any c

  → mode = m = $\bar{x}$ (m = median)
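The relation between absolute frequency f and relative frequency w = f / n can be sketched in Python as follows. The data set is invented for illustration, and the names `abs_freq` and `rel_freq` are assumptions of this sketch.

```python
from collections import Counter

# Made-up categorical sample.
data = ["a", "b", "a", "c", "a", "b"]
n = len(data)

# Absolute frequency f of each distinct value.
abs_freq = Counter(data)

# Relative frequency w = f / n; the relative frequencies sum to 1.
rel_freq = {value: f / n for value, f in abs_freq.items()}

for value in abs_freq:
    print(value, abs_freq[value], rel_freq[value])
```

Multiplying each relative frequency by n gives back the absolute frequency, which is the identity f = n · w.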

SAMPLE STATISTICS

- CENTRALITY MEASURES

SAMPLE MEAN

$\bar{x} = \frac{x_1 + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$

weighted mean

$\bar{x} = \frac{1}{n} \sum_{i=1}^{k} f_i x_i = \sum_{i=1}^{k} w_i x_i$

SAMPLE MEDIAN

1. Order from smallest to largest

→ n odd : $m = x_{\frac{n-1}{2} + 1}$

→ n even : $m = \frac{x_{n/2} + x_{n/2 + 1}}{2}$

SAMPLE MODE

the data value that occurs most frequently

- VARIATION - SPREAD

SAMPLE VARIANCE

$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}{n - 1}$

STANDARD DEVIATION

$s = \sqrt{s^2}$

SAMPLE RANGE

maximum - minimum (larger range → larger variance)

sample covariance

$s_{xy} = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{n - 1}$

POPULATION (sample → population)

$\bar{x} \to \mu$

$s^2 \to \sigma^2$

$\hat{p} \to p$

MEAN - MEDIAN RELATIONSHIP

symmetric : $\bar{x} = m$
right-skewed : $\bar{x} > m$
left-skewed : $\bar{x} < m$

CORRELATION

$r = \frac{s_{xy}}{s_x \cdot s_y} = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sqrt{(\sum x_i^2 - n \bar{x}^2) \cdot (\sum y_i^2 - n \bar{y}^2)}}$
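The sample statistics listed above can be checked with a short Python sketch. It uses the computational forms (sum of squares minus n times the squared mean, divided by n - 1). The function names and the two small data sets are assumptions made for illustration.

```python
import math
from collections import Counter

def sample_mean(x):
    return sum(x) / len(x)

def sample_median(x):
    s = sorted(x)                # 1. order from smallest to largest
    n = len(s)
    mid = n // 2
    if n % 2 == 1:               # n odd: the middle ordered value
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2   # n even: average of the two middle values

def sample_mode(x):
    # Value that occurs most frequently (first one found in case of ties).
    return Counter(x).most_common(1)[0][0]

def sample_variance(x):
    n = len(x)
    xbar = sample_mean(x)
    return (sum(v * v for v in x) - n * xbar ** 2) / (n - 1)

def sample_covariance(x, y):
    n = len(x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - n * sample_mean(x) * sample_mean(y)) / (n - 1)

def correlation(x, y):
    # r = s_xy / (s_x * s_y)
    return sample_covariance(x, y) / math.sqrt(sample_variance(x) * sample_variance(y))

# Made-up paired observations.
x = [2, 4, 4, 5, 10]
y = [1, 3, 2, 5, 9]

print(sample_mean(x), sample_median(x), sample_mode(x))   # 5.0 4 4
print(sample_variance(x), math.sqrt(sample_variance(x)))  # 9.0 3.0
print(max(x) - min(x))                                    # sample range: 8
print(sample_covariance(x, y))                            # 9.25
print(correlation(x, y))
```

In this example the mean (5.0) exceeds the median (4), which matches the right-skewed case of the mean-median relationship.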

BOX-PLOT

- variability index

1. Median → at position n/2 or n · 0.5 of the ordered data

2. First and third quartiles

   → 25th p. = $Q_1$, at position n · 0.25

   → 75th p. = $Q_3$, at position n · 0.75

3. IQR → IQR = 75th p. - 25th p.

4. Whiskers

   → LW = 25th p. - 1.5 · IQR

   → UW = 75th p. + 1.5 · IQR
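The box-plot quantities above can be sketched in Python with the standard library. Note that `statistics.quantiles` uses its own interpolation rule (the default "exclusive" method), which may differ slightly from the n · p position rule in these notes. The data set is made up for illustration.

```python
import statistics

# Made-up sample with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]

# Quartiles: 25th percentile (Q1), median, 75th percentile (Q3).
q1, median, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                      # interquartile range
lower_whisker = q1 - 1.5 * iqr     # LW = 25th p. - 1.5 * IQR
upper_whisker = q3 + 1.5 * iqr     # UW = 75th p. + 1.5 * IQR

# Values beyond the whisker limits are usually drawn as separate points.
outliers = [v for v in data if v < lower_whisker or v > upper_whisker]

print(q1, median, q3)                 # 3.0 6.0 9.0
print(iqr)                            # 6.0
print(lower_whisker, upper_whisker)   # -6.0 18.0
print(outliers)                       # [30]
```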

- SAMPLE PERCENTILES

- LINEAR TRANSFORMATIONS

- NORMAL DATA

- Data set is normal if its histogram :

  → is highest at the middle interval (mode = sample mean = median)

  → is bell-shaped

  → is symmetric about the middle interval

PROBABILITY THEORY

RANDOM VARIABLES

- Support $S_X$ : set of possible values which X can take

- DISCRETE RANDOM VARIABLES : take only integer values → probability function

- CONTINUOUS RANDOM VARIABLES : take real (decimal) values → density function

- If X and Y are independent :

  → discrete : $\Pr(X = x, Y = y) = \Pr(X = x) \cdot \Pr(Y = y)$

  → continuous : $f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)$
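For discrete random variables, the product rule Pr(X = x, Y = y) = Pr(X = x) · Pr(Y = y) can be checked numerically. The sketch below is illustrative only: the joint probability tables and the helper names `marginals` and `is_independent` are assumptions, not part of the notes.

```python
import math

def marginals(joint):
    """Sum the joint probabilities over y (for X) and over x (for Y)."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def is_independent(joint, tol=1e-9):
    """True if every joint probability equals the product of its marginals."""
    px, py = marginals(joint)
    return all(
        math.isclose(joint.get((x, y), 0.0), px[x] * py[y], abs_tol=tol)
        for x in px
        for y in py
    )

# Hypothetical joint table built so that Pr(X=0) = 0.4 and Pr(Y=0) = 0.3.
joint_indep = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}

# Hypothetical joint table where Y always equals X, so the product rule fails.
joint_dep = {(0, 0): 0.5, (1, 1): 0.5}

print(is_independent(joint_indep))  # True
print(is_independent(joint_dep))    # False
```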

PROPERTIES OF E(·) AND Var(·)

NORMAL R.V.


The contents of this page are personal reworkings by the Publisher EMMAMNRT of information learned by attending the Statistics lectures and through independent study of any reference books, in preparation for the final exam or thesis. They should not be regarded as official material of the Università degli studi Ca' Foscari di Venezia or of Prof. Bussoli Ilaria.