By Wingfeet
(This article was originally published at Wiekvoet, and syndicated at StatsBlogs.)
I was browsing Davies' Design and Analysis of Industrial Experiments (second edition, 1967), published for ICI at a time when industry did that kind of thing. It is quite an applied book. On page 107 there is an example where the variance of a process is estimated.
Data
The data come from nine batches. From each batch three samples were selected (A, B and C), and each sample was measured in duplicate. I am not sure about copyright of these data, so I will not reprint the data here. The problem is to determine the measurement and sampling error in a chemical process.
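Since the data are not reprinted, the layout can be sketched with simulated values. The block below is only a stand-in, under stated assumptions: the column names x, batch and Sample follow the code in this post, while the rep column, the overall mean and the three standard deviations are hypothetical and chosen purely for illustration. None of the numbers come from Davies.

```r
# Placeholder for the data frame r4 used in this post.
# The x values are SIMULATED; they are not the figures from the book.
set.seed(1)
r4 <- expand.grid(
  rep    = 1:2,                       # duplicate measurements (hypothetical column)
  Sample = factor(c("A", "B", "C")),  # three samples per batch
  batch  = factor(1:9)                # nine batches
)

# Hypothetical effects: one per batch, one per sample within batch
batch_eff  <- rnorm(9,  sd = 4)[as.integer(r4$batch)]
sample_id  <- as.integer(interaction(r4$Sample, r4$batch))
sample_eff <- rnorm(27, sd = 0.6)[sample_id]

# Response = mean + batch effect + sampling effect + measurement error
r4$x <- 50 + batch_eff + sample_eff + rnorm(nrow(r4), sd = 0.9)

str(r4)
```

This gives 9 x 3 x 2 = 54 rows, which matches the degrees of freedom of a balanced nested design of this size (8 for batches, 18 for samples within batches, 27 for duplicates).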
library(ggplot2)
# One panel per batch, showing the duplicate measurements for each sample
ggplot(r4, aes(x = Sample, y = x)) +
  geom_point() +
  facet_wrap(~ batch)
Analysis
At the time the book was written, the only approach was to do a classical ANOVA and calculate the estimates from there.
library(magrittr)  # provides the %>% pipe
aov(x ~ batch + batch:Sample, data = r4) %>%
  anova()
Analysis of Variance Table
Response: x
             Df Sum Sq Mean Sq  F value  Pr(>F)
batch         8 792.88  99.110 132.6710
batch:Sample 18  25.30   1.406   1.8818 0.06675 .
Residuals    27  20.17   0.747
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
In this case the residual variance is 0.75. Because each sample was measured in duplicate, the batch:Sample mean square estimates twice the sampling variance plus the residual variance. Hence the sampling variance is estimated as (1.406 - 0.747)/2 = 0.33. How lucky we are to have tools (lme4) which can do this estimation directly. In this case, as it was a well designed experiment, …
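The arithmetic behind these estimates can be sketched in a few lines of R, using only the mean squares from the ANOVA table. The batch variance component is not discussed in the text. It is added here on the assumption that the usual expected mean squares for a balanced nested design apply, so that the batch mean square estimates the residual variance plus 2 times the sampling variance plus 6 times the batch variance.

```r
# Mean squares copied from the ANOVA table
ms_batch    <- 99.110  # batch,        8 df
ms_sample   <- 1.406   # batch:Sample, 18 df
ms_residual <- 0.747   # Residuals,    27 df

n_dup    <- 2  # duplicate measurements per sample
n_sample <- 3  # samples per batch

# Method-of-moments estimates from the expected mean squares
var_measurement <- ms_residual
var_sampling    <- (ms_sample - ms_residual) / n_dup
# Assumption: balanced nested design, so batch MS carries n_dup * n_sample
# times the batch variance on top of the batch:Sample expectation
var_batch       <- (ms_batch - ms_sample) / (n_dup * n_sample)

print(c(measurement = var_measurement,
        sampling    = var_sampling,
        batch       = var_batch))
```

The sampling component comes out at roughly 0.33, as in the text. With lme4, one way to fit the same nested structure directly might be lmer(x ~ 1 + (1 | batch) + (1 | batch:Sample), data = r4). That formula is an assumption about how the fit could be written, not something taken from the text above.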