ONE-WAY ANOVA
One-way ANOVA:
Stat → Anova → One way
- Response = outcome of the experiment (Y)
- Factor = process variables that affect the output (in a one-way ANOVA: 1 factor)
Storage: Residuals
Graph: all graphs
Options: Confidence level = 95%
• BOXPLOT → idea of the distribution of the sample (check outliers)
• NORMAL PROBABILITY PLOT → to check whether the distribution is normal (points falling close to a straight line indicate normality)
• P-VALUE → if < 0.05, we reject the null hypothesis (⟹ exists at least one μ different from the others) and so the factor is significant in terms of impact on the response variable
• STANDARD DEVIATION (S) → standard deviation of the distances between the data values and the fitted values. It tells how well the model describes the response (the lower, the better!)
• F-VALUE → Adj MS (factor) / Adj MS (error) (the higher, the better)
• R-SQ(ADJ) (= 1 – (MSerr/MStot)) → percentage of variation in the response that is explained by the model. The factors included in the model explain “R-sq(adj)”% of the variation present in the response variable (a good value is higher than 70%).
- If p-value < 0.05 and R-sq(adj) is very low, it means that we did not take into account other factors that have an impact on the Y variable.
- R-SQ = 1 – (SS error / SS total). It is usually better to consider R-sq(adj), since it is more conservative: it takes into account the number of predictors and the number of observations in the model.
- R-SQ(PRED) tells the fit of the model on new data, i.e. the capability of the model to predict new observations.
- MEAN → the best level is the one with the highest value when we want to maximize the response variable.
- TUKEY ANALYSIS → used to identify clusters, i.e. to say which level of the factor statistically provides better results in terms of the response variable (this analysis is strictly related to the sample size).
Check the 3 hypotheses of the ANOVA:
- Test of independence
From graph “Versus Order” (residual plotted versus the observation order).
If we don’t see any pattern, we can say that the residuals are independent.
- Normality test
Graph → Probability plot → RESI
If the p-value of the test is higher than the chosen α-level, we cannot reject H0 and conclude that the data follow a normal distribution (⟹ p-value > 0.05).
(in the picture aside we have a p-value ≥ 0.05, so the distribution is normal: the points follow a straight line)
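The normality check on the residuals can be sketched outside Minitab with the Shapiro-Wilk test from SciPy (the residuals below are randomly generated stand-ins for the RESI column):

```python
# Normality check of ANOVA residuals, a minimal sketch with SciPy.
# The data are invented; in practice, test the stored RESI column.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=30)  # stand-in for RESI

stat, p_value = stats.shapiro(residuals)
# p-value > 0.05 -> cannot reject H0: residuals follow a normal distribution
print(f"Shapiro-Wilk statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Minitab's probability plot reports an Anderson-Darling p-value instead; the decision rule against α is the same.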
If normality is not met:
- Non-parametric test → Kruskal-Wallis (tests whether the samples originate from the same distribution)
- Employing a transformation
- Test of equal variances
Stat → Anova → Test for equal variances
Response = Resi; Factors = “factor” (Storage: tick “Multiple Comparison Interval”)
If both p-values > 0.05, we can’t reject the null hypothesis and the test of equal variances is verified.
- Multiple comparisons: more conservative since it takes into account also the number of observations.
- Levene's test
If the test of equal variances is not verified (p-value < 0.05), we can redo the ANOVA unticking "Assume Equal Variances" (in "Options"), which means performing Welch's ANOVA (compares the means without assuming equal variances)
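The one-way workflow above (equal-variance check, ANOVA, non-parametric fallback) can be sketched with SciPy; the three machine samples are invented illustration data:

```python
# One-way ANOVA workflow, a sketch with SciPy instead of Minitab's menus.
# The three samples below are invented (e.g. output of 3 machines).
from scipy import stats

machine_a = [12.1, 11.8, 12.4, 12.0, 11.9]
machine_b = [12.9, 13.1, 12.8, 13.3, 13.0]
machine_c = [12.2, 12.0, 12.3, 12.1, 12.4]

# Test of equal variances (Levene's test, as in the notes)
_, p_levene = stats.levene(machine_a, machine_b, machine_c)

if p_levene > 0.05:
    # Variances can be assumed equal -> classic one-way ANOVA
    f_stat, p_anova = stats.f_oneway(machine_a, machine_b, machine_c)
else:
    # Would switch to Welch's ANOVA here (no equal-variance assumption)
    f_stat, p_anova = None, None

# Non-parametric fallback when normality is not met (Kruskal-Wallis)
_, p_kruskal = stats.kruskal(machine_a, machine_b, machine_c)

print(p_levene, p_anova, p_kruskal)
```

With these data the equal-variance test passes and both tests flag at least one machine mean as different (p < 0.05), so the factor is significant.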
TWO-WAY ANOVA
Two-way ANOVA:
Stat → Anova → General linear model → Fit General linear model
- Response = outcome of the experiment (Y)
- Factor = process variables that affect the output (in a two-way ANOVA: 2 factors)
Model: Select both factors and click "Add" to also add the interaction between the factors
Storage: Residuals
- Degrees of freedom:
o DF (factor) = #levels - 1
o DF (interaction) = DF (factor1) * DF (factor2)
o DF (total) = #observations - 1
o DF (error) = DF (total) - DF (factor1) - DF (factor2) - DF (interaction)
- Adj MS = Adj SS / DF
- F-value (term) = Adj MS (term) / Adj MS (error)
- R-sq(adj) = the model explains x% of the variability
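The degrees-of-freedom bookkeeping above can be checked with a few lines of plain Python (the numbers of levels and replicates below are invented for illustration):

```python
# Degrees of freedom for a two-way ANOVA with interaction, following
# the formulas in the notes. Levels/replicates are invented examples.
levels_factor1 = 3      # e.g. 3 temperatures
levels_factor2 = 2      # e.g. 2 materials
replicates = 4          # observations per factor combination
n_obs = levels_factor1 * levels_factor2 * replicates

df_factor1 = levels_factor1 - 1               # #levels - 1
df_factor2 = levels_factor2 - 1
df_interaction = df_factor1 * df_factor2      # product of the factor DFs
df_total = n_obs - 1                          # #observations - 1
df_error = df_total - df_factor1 - df_factor2 - df_interaction

print(df_factor1, df_factor2, df_interaction, df_total, df_error)
# Adj MS = Adj SS / DF, and F (term) = Adj MS (term) / Adj MS (error)
```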
Test for normality of residuals:
Graph → Probability plot → Single; Variable: Residuals
Residuals = observed values – values estimated by the model
If the p-value < chosen α-level, then you must reject H0 and conclude that your data do not follow a normal distribution.
Test for equal variance:
Stat → Anova → Test for equal variance
- Response = Response
- Factors = the 2 factors
- Options: Confidence level = 95.0; tick “Use test based on normal distribution” (if the residuals are normally distributed)
If p-value > 0.05 we cannot reject H0 (graphically, the confidence intervals are overlapping): the variances of the residuals are equal.
Main effect plot:
Stat → Anova → Main effect plot
- Response = Response
- Factors = the 2 factors
Interaction plot:
Stat → Anova → Interaction plot
- Response = Response
Factors = the 2 factors
Tukey Pairwise comparisons:
Stat → Anova → General linear model → Comparison
Choose terms for comparison: factor 1, factor 2, Interaction
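The main-effect and interaction plots are drawn from the cell means of the response. A minimal sketch of those means with the standard library (the runs below are invented; crossing lines of cell means signal a strong interaction):

```python
# Cell means behind main-effect and interaction plots, a plain-Python
# sketch. The (factor1, factor2, response) runs below are invented.
from statistics import mean

runs = [
    ("low", "A", 10), ("low", "A", 12),
    ("low", "B", 20), ("low", "B", 22),
    ("high", "A", 30), ("high", "A", 28),
    ("high", "B", 18), ("high", "B", 16),
]

def cell_mean(f1, f2):
    """Mean response for one combination of factor levels."""
    return mean(r for a, b, r in runs if a == f1 and b == f2)

# If the two lines of cell means cross when plotted, factor2's effect
# depends on factor1's level: the interaction is significant.
for f1 in ("low", "high"):
    print(f1, cell_mean(f1, "A"), cell_mean(f1, "B"))
```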
DESIGN OF EXPERIMENT – Full factorial
Create design of experiment:
Stat → DOE → Factorial → Create Factorial design
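What "Create Factorial design" generates can be sketched by hand: every combination of the low/high levels of each factor in coded −1/+1 units (the factor names below are hypothetical):

```python
# A 2^3 full-factorial design table in coded -1/+1 units, a sketch of
# what Minitab's "Create Factorial Design" produces (names invented).
from itertools import product

factors = ["Temperature", "Pressure", "Time"]  # hypothetical factors
design = list(product([-1, 1], repeat=len(factors)))  # 2^3 = 8 runs

for run, levels in enumerate(design, start=1):
    print(run, dict(zip(factors, levels)))
```

Minitab would then randomize the run order unless randomization is excluded.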
Analyse design of experiment:
Stat → DOE → Factorial → Analyse Factorial design
- Response = response variable
- Graphs: “Normal”, “Pareto”, Normal plot, Residuals vs order
p-value < 0.05 → the factors are impacting the result (in this case also the “constant”)
We can see the significance of the factors also through the “Pareto chart of the standardized effects”
The reference line represents the 5% limit.
Clean the model:
If at least one factor is not significant, we have to clean the model by taking it off.
Stat → DOE → Factorial → Analyse Factorial design
- Response = response variable
- Terms: select only the relevant parameters (according to the previous analysis)
Now all the factors have p-value < 0,05. The difference between the mathematical value (R-sq) and conservative value (R-sq(adj)) is less than before (better).
Lack-of-fit means that the model is poor (for example, when we have some unusual data): one observation may not be well fitted. In fact, in the “Fits and diagnostics for unusual observations” we can see that for observation number 5 the model’s fitted value is 184, while the observed result is 199.
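The "fits vs observations" comparison behind that diagnostic can be sketched with NumPy on a coded factorial design (the design and responses below are invented; a run with a large residual is the kind that Minitab flags as unusual):

```python
# Fitted values and residuals for a cleaned factorial model, a NumPy
# sketch. Design and responses are invented illustration data.
import numpy as np
from itertools import product

design = np.array(list(product([-1, 1], repeat=2)))  # 2^2 coded design
y = np.array([20.0, 30.0, 24.0, 35.0])               # observed responses

# Model matrix: intercept + the two main effects
# (interaction dropped after cleaning the model)
X = np.column_stack([np.ones(len(y)), design])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

fits = X @ coef            # what the model expects for each run
residuals = y - fits       # a large residual -> poorly fitted run
print(fits, residuals)
```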
Factorial Plot:
Represents the impact of the significant factors on the result
Stat → DOE → Factorial → Factorial Plot
- Response: response variable
- Selected: only relevant factors
Main effects plot & interaction plot
Contour plot
DOE – Factorial – Contour plot
Response optimizer:
DOE → Factorial → Response optimizer
- Goal: select the right goal
- Setup: select “Lower”, “Target” and “Upper”
for the response factor.
d = desirability (= 1 if we exactly meet the goal). If we go outside the specifications, d = 0.
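The desirability rule for a "target is best" goal can be sketched as a small function (the Lower/Target/Upper bounds below are invented; Minitab's optimizer interpolates between the bounds in a similar spirit):

```python
# Desirability d for a "target is best" response, a sketch of the rule
# in the notes: d = 1 at the goal, d = 0 outside the specifications,
# linear in between. The specification bounds below are invented.
def desirability(y, lower, target, upper):
    if y <= lower or y >= upper:
        return 0.0          # outside the specifications
    if y < target:
        return (y - lower) / (target - lower)
    return (upper - y) / (upper - target)

print(desirability(50, lower=40, target=50, upper=60))  # on target -> 1.0
print(desirability(45, lower=40, target=50, upper=60))  # halfway -> 0.5
print(desirability(65, lower=40, target=50, upper=60))  # out of spec -> 0.0
```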
DESIGN OF EXPERIMENT – Fractional factorial
Create design of experiment:
Stat → DOE → Factorial → Create Factorial Design
- Enter #factors
- Design: “½ fraction”
- Options: Exclude “randomize runs”
- Enter the results in the last column
Analyse design of experiment:
Stat → DOE → Factorial → Analyse Factorial design
- “Normal”, “Pareto”
Clean the model:
If at least one factor is not significant, we have to clean the model by taking it off.
Stat → DOE → Factorial → Analyse Factorial design
- Response = response variable
- Terms: select only the relevant parameters (according to the previous analysis)
Factorial plot:
Stat → DOE → Factorial → Factorial plot
Main effects plot & interaction plot
Contour plot:
Stat → DOE → Factorial → Contour plot
- Response: response variable
- Settings: optimal value of the held factor
We consider the two most relevant factors and hold the third factor constant.
Response Optimizer
Stat → DOE → Factorial → Response optimizer
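The "½ fraction" option above can be sketched by hand: build the full design in the first factors and derive the last one from a generator, here the common choice D = ABC (a hedged illustration, not necessarily the generator Minitab picks):

```python
# A 2^(4-1) half-fraction built from a 2^3 base design with the
# generator D = ABC, a sketch of Minitab's "1/2 fraction" option.
from itertools import product

base = list(product([-1, 1], repeat=3))  # full design in A, B, C
half_fraction = [(a, b, c, a * b * c) for a, b, c in base]  # D = A*B*C

for run in half_fraction:
    print(run)
# 8 runs instead of 16: the price is aliasing, e.g. the main effect of D
# is confounded with the ABC interaction.
```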