STATA COMMANDS
function=>you can record the process); copy graph after creation into the word
processor…
Histogram in STATA
• Graphics => Histogram
• tab Main => select the variable for which you want to create the graph
– tick "Data are discrete" (if the variable is discrete)
– section Y axis => select Percent (for relative frequencies) or Frequency (for absolute frequencies)
• tab X axis => insert the title of the X axis into the box Title
• tab Titles => insert title, subtitle, note (etc.)
• if we want to create a histogram for quantitative data, the procedure is essentially the same, only instead of "Data are discrete" we tick "Data are continuous", or we use the command histogram varname, frequency (we may combine this with some other options):
– histogram varname if varname<25, frequency by(sex) … etc.
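The menu steps above correspond roughly to the commands below. This is a minimal sketch that assumes the auto example dataset shipped with STATA; the variables rep78, mpg and foreign and the titles are illustrative choices, not part of the original notes.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* Discrete variable: one bar per value, relative frequencies on the Y axis
histogram rep78, discrete percent xtitle("Repair record 1978") title("Repair record")

* Continuous variable: absolute frequencies, restricted with if,
* one panel per category of another variable
histogram mpg if mpg < 25, frequency by(foreign)
```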
Bar chart in STATA
the best way to do it in STATA is to use the histogram option:
• Graphics => Histogram
• tab Main => select the variable for which we create the graph
– tick the option "Data are discrete"
– in the Y axis tab we select what we want to display on the Y axis (Percent for relative frequency, Frequency for absolute frequency…)
– Bar properties => choose a bar gap of 10
• insert titles, caption, note etc. in the Titles tab
• tab X axis
– click on Major tick/label properties => tab Labels => tick Use value labels (this displays the descriptions of the categories); if the descriptions are too large, we have to adjust them so they fit into the graph (we can lower the size), or we can change the angle of the description (Angle), or we can change the rule => Suggest # of ticks (and choose a lower number)
– fill the box Title to name the X axis
• tab Y axis => fill the box Title to name the Y axis
• tab Main => tick Add height labels to bars (the categories' values will be added to the graph)
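The same bar chart can be sketched as a single command. The example assumes the auto dataset shipped with STATA and its labelled 0/1 variable foreign; the titles and the label angle are illustrative choices.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* Bar chart via histogram: discrete bars with a gap of 10, bar heights printed,
* value labels on the X axis (the /// continuation works in do-files only)
histogram foreign, discrete percent gap(10) addlabels ///
    xlabel(0 1, valuelabel angle(45) labsize(small)) ///
    xtitle("Car origin") ytitle("Percent") title("Cars by origin")
```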
Stem-and-Leaf plot in STATA
• Statistics => Summaries, Tables, and Tests => Distributional Plots and Tests => Stem-and-Leaf Display
– in the variable box select the variable for which we create the graph
– we can tick the option "Do not print stems that have no leaves" to exclude empty categories from the graph
– in the Lines section we can choose how many rows the graph should have => click on Submit
• a two-way stem-and-leaf plot cannot be created in STATA, but we can at least run the command for several categories of a different variable, if we select it in the tab by/if/in
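In command form, the dialog choices above map onto the stem command. The sketch assumes the auto dataset shipped with STATA; mpg and foreign are illustrative variables.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* Basic stem-and-leaf display
stem mpg

* Do not print stems that have no leaves
stem mpg, prune

* Two stem lines per interval
stem mpg, lines(2)

* Separate display for each category of another variable
by foreign, sort: stem mpg
```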
Scatterplot in STATA
• Graphics => Twoway graph…
– tab Plots => click on Create…
• choose the options Basic plots and Scatter
• select the X variable and the Y variable, then Submit
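The equivalent command lists the Y variable first and the X variable second. A minimal sketch, assuming the auto dataset shipped with STATA:

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* Scatterplot: Y variable (mpg) first, X variable (weight) second
twoway scatter mpg weight, xtitle("Weight (lbs.)") ytitle("Mileage (mpg)")
```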
Describing variability and shape in STATA
• one command for all the descriptive statistics (quantiles, mean, standard deviation, variance, skewness and kurtosis): summarize varname, detail => output: all of the above-mentioned statistics
• weighted descriptive statistics
– first insert a variable that will represent the weights
– then, summarize varname [fweight=varname_weights], detail => all provided statistics are weighted
• fweight may be used only when the weighting variable is given in integers (if it is not in integers, use aweight instead)
• descriptive statistics when we have data sorted into intervals
– if we know the frequencies in the individual intervals (variable ni) and the centers of those intervals (variable xs), then the descriptive statistics can be computed by summarize xs [fweight=ni], detail (or use aweight if the variable is not given in integers)
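A short sketch of both cases follows. The first part assumes the auto dataset shipped with STATA; the interval centers and frequencies in the second part are made-up illustration values, not data from the course.

```stata
* Raw data: all descriptive statistics for one variable
sysuse auto, clear
summarize mpg, detail

* Grouped data: interval centers (xs) and integer frequencies (ni)
* The four rows below are hypothetical values for illustration only
clear
input xs ni
 5  3
15  7
25 10
35  5
end

* Frequency-weighted statistics; the weighted mean is 545/25 = 21.8
summarize xs [fweight=ni], detail
display r(mean)
```

If the frequencies were not integers, aweight would replace fweight in the last summarize command.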
Percentiles in STATA
• calculation of quantiles
– the most frequently used quantiles:
• summarize varname, detail
– the value of a given quantile: _pctile varname, p(10)
• the number within the brackets indicates the quantile (here the 10th percentile)
• STATA stores the value of the quantile in its internal memory => use the command return list to recall it
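The stored results can be inspected and reused as sketched below. The example assumes the auto dataset shipped with STATA; mpg is an illustrative variable.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* 10th percentile: computed silently and stored in r(r1)
_pctile mpg, p(10)
return list
display r(r1)

* Several quantiles at once: stored in r(r1), r(r2), r(r3)
_pctile mpg, p(25 50 75)
display r(r1) "  " r(r2) "  " r(r3)
```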
Box plot in STATA
• Graphics => Box plot
• tab Main
– choose vertical box plot or horizontal box plot
– insert the variable for which we create the box plot
• tab Categories => tick Group 1 and, possibly, insert the variable by which we want to group the analyzed variable => this generates one box plot per group (if we want only one box plot, we leave this tab unchanged)
• tab if/in => if we want to set a condition…
• tab Titles => we insert titles, caption, notes etc. into the graph…
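As commands, the vertical and horizontal variants are graph box and graph hbox, with over() playing the role of the Categories tab. A minimal sketch, assuming the auto dataset shipped with STATA; the variables and the price condition are illustrative.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* Single vertical box plot
graph box mpg, title("Mileage")

* Horizontal box plots, one per category of the grouping variable
graph hbox mpg, over(foreign) title("Mileage by car origin")

* Box plots restricted by a condition (the if/in tab)
graph box mpg if price < 10000, over(foreign)
```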
STATA: Finding the probability if we know z
• command di normal(z)
– STATA will compute the left-tail cumulative probability
– see the white area on the picture (i.e. the same as in MS Excel)
– if we want the right-tail probability (grey area on the picture), we have to subtract the STATA result from 1; by symmetry, this equals the left-tail probability of -z
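For example, with z = 1.96 (an illustrative value, not taken from the notes) the two tails can be computed as follows.

```stata
* Left-tail cumulative probability P(Z <= 1.96), approximately 0.975
display normal(1.96)

* Right-tail probability P(Z > 1.96), approximately 0.025
display 1 - normal(1.96)

* By symmetry, the same right-tail value as the left tail of -z
display normal(-1.96)
```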
Summary: population, samples, sampling distribution
1. Population distribution
•distribution, from which we select samples; usually unknown
• we make inferences about its characteristics (such as µ, the mean, and σ, the variability); we denote the population size by N
2. Sample data distribution
• distribution of the data we actually observe (frequency distribution); we can describe this distribution with descriptive statistics (ȳ or s etc.)
• the larger the sample size n, the more closely the sample data distribution resembles the population distribution and the closer the sample statistics (such as ȳ) fall to the population parameters (such as µ)
3. Sampling distribution (of a statistic)
• probability distribution for the possible values of a sample statistic (ȳ)
• describes the variability that occurs in the statistic's value among samples of a certain size
•it determines the probability that the statistic falls within a certain distance of the
population parameter it estimates
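The idea of a sampling distribution can be illustrated by simulation. The sketch below is not part of the original notes: the program name onesample, the normal population with µ = 100 and σ = 15, the sample size of 50 and the 1000 replications are all assumptions chosen for illustration.

```stata
* Hypothetical helper: draw one sample of n = 50 and return its mean
capture program drop onesample
program define onesample, rclass
    drop _all
    set obs 50
    generate y = rnormal(100, 15)
    summarize y
    return scalar ybar = r(mean)
end

* Repeat the sampling 1000 times and collect the sample means
set seed 12345
simulate ybar = r(ybar), reps(1000) nodots: onesample

* The collected means approximate the sampling distribution of the sample mean;
* their standard deviation should be near 15/sqrt(50), about 2.12
summarize ybar
```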
STATA: CI for a population proportion
• if we estimate the CI for a population proportion π (based on the sample proportion π̂) for a variable that is in STATA:
– ATTENTION: this variable must be in the 0 and 1 format (if it is not, we have to convert it into that format, or use the STATA calculator)
– if the variable is in the 0 and 1 format, we use the command ci variablename, binomial level(XX), or the STATA menu: Statistics => Summaries, tables and tests => Summary and Descriptive Statistics => Confidence intervals
• select the variable for which we estimate the CI
• variable type => Binomial variables (0/1)
• select the confidence level… (+ set conditions in the by/if/in tab)
• if we don't have the variable in STATA, or the variable is not in the 0 and 1 format, we use the STATA calculator for CI: Statistics => Summaries, tables and tests => Summary and Descriptive Statistics => Binomial CI Calculator
– we set the sample size (n), the number of cases that fall in the category of our interest (= successes), the confidence level, and we select "exact"
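A sketch of both routes follows. It assumes the auto dataset shipped with STATA, whose 0/1 variable foreign has 22 ones among 74 observations. It also assumes Stata 14 or later, where the command is written ci proportions; the form given in the notes, ci variablename, binomial, is the syntax of older releases.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* CI for a proportion from a 0/1 variable (syntax assumed for Stata 14 or later)
ci proportions foreign, level(95)

* Form used in these notes, for older releases:
* ci foreign, binomial level(95)

* Calculator version: sample size 74, 22 successes, exact interval by default
cii proportions 74 22, level(95)
```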
STATA: CI for population means
• if we estimate the CI for a population mean μ (based on the sample mean ȳ) for a variable that is in STATA:
– command ci variablename, level(XX) … XX is the confidence level
• ci hrs1, level(99)
– using the menu: Statistics => Summaries, tables and tests => Summary and Descriptive Statistics => Confidence intervals
• we choose the variable for which we estimate the CI
• variable type => Normal variables
• select the confidence level… (+ conditions in the by/if/in tab)
• if we don't have the variable in STATA, we use the STATA calculator for CI: Statistics => Summaries, tables and tests => Summary and Descriptive Statistics => Normal CI Calculator
– we set the sample size (n), the value of the sample mean (ȳ), the value of the sample standard deviation (s) and the confidence level
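The commands behind these two dialogs can be sketched as below. The example assumes the auto dataset shipped with STATA and Stata 14 or later, where the command is written ci means; the summary values passed to the calculator (n = 50, mean 40.5, s = 12.3) are made-up illustration numbers.

```stata
* Example data shipped with STATA (an assumption; use your own dataset instead)
sysuse auto, clear

* 99% CI for a mean from a variable (syntax assumed for Stata 14 or later)
ci means mpg, level(99)

* Form used in these notes, for older releases:
* ci mpg, level(99)

* Calculator version: n, sample mean, sample standard deviation (hypothetical values)
cii means 50 40.5 12.3, level(95)
```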
STATA: comparing two proportions
• we want to investigate the difference between two population proportions π2 – π1 based on the knowledge of the difference between two sample proportions π̂2 – π̂1
– 1) we want to estimate the CI for the difference of the population proportions
– 2) we want to test the null hypothesis that the difference between the population proportions is equal to zero (i.e. they are equal):
• H0: π2 – π1 = 0 (i.e. π2 = π1) against H1: π2 – π1 ≠ 0 (i.e. π2 ≠ π1)
• if we have both of the variables in STATA:
– prtest variable1 == variable2, or using the STATA menu: Statistics => Summaries, tables and tests => Classical tests of hypotheses => Two-sample proportion test: the STATA output contains the CI, as well as the test results
• if we investigate the difference between two population proportions of two