$$ \begin{align}
\text{SSW} &= \sum_{i=1}^g \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[3em]
&= \sum_{i=1}^{3} \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[1em]
&= \sum_{j=1}^{4}\ (x_{1,j} - 4.25)^2 + \sum_{j=1}^{8}\ (x_{2,j} - 0.875)^2 + \sum_{j=1}^{9}\ (x_{3,j} - 3.4444)^2 \\[1em]
&= (x_{1,1} - 4.25)^2\ + (x_{1,2} - 4.25)^2\ + (x_{1,3} - 4.25)^2\ + (x_{1,4} - 4.25)^2\ + \\
& \qquad
(x_{2,1} - 0.875)^2\ + (x_{2,2} - 0.875)^2\ + (x_{2,3} - 0.875)^2\ + (x_{2,4} - 0.875)^2\ + (x_{2,5} - 0.875)^2\ + (x_{2,6} - 0.875)^2\ + (x_{2,7} - 0.875)^2\ + (x_{2,8} - 0.875)^2\ + \\
& \qquad
(x_{3,1} - 3.4444)^2\ +\ (x_{3,2} - 3.4444)^2\ +\ (x_{3,3} - 3.4444)^2\ +\ (x_{3,4} - 3.4444)^2\ +\ (x_{3,5} - 3.4444)^2\ +\ (x_{3,6} - 3.4444)^2\ +\ (x_{3,7} - 3.4444)^2\ +\ (x_{3,8} - 3.4444)^2\ +\ (x_{3,9} - 3.4444)^2 \\[1em]
&= (6 - 4.25)^2\ + (7 - 4.25)^2\ + (6 - 4.25)^2\ + (-2 - 4.25)^2\ + \\
& \qquad
(-10 - 0.875)^2\ + (-17 - 0.875)^2\ + (2 - 0.875)^2\ + (9 - 0.875)^2\ + (2 - 0.875)^2\ + (8 - 0.875)^2\ + (14 - 0.875)^2\ + (-1 - 0.875)^2\ + \\
& \qquad
(2 - 3.4444)^2\ +\ (4 - 3.4444)^2\ +\ (-3 - 3.4444)^2\ +\ (11 - 3.4444)^2\ +\ (9 - 3.4444)^2\ +\ (-4 - 3.4444)^2\ +\ (8 - 3.4444)^2\ +\ (-3 - 3.4444)^2\ +\ (7 - 3.4444)^2 \\[1em]
&= (1.75)^2\ + (2.75)^2\ + (1.75)^2\ + (-6.25)^2\ + \\
& \qquad
(-10.875)^2\ + (-17.875)^2\ + (1.125)^2\ + (8.125)^2\ + (1.125)^2\ + (7.125)^2\ + (13.125)^2\ + (-1.875)^2\ + \\
& \qquad
(-1.4444)^2\ +\ (0.5556)^2\ +\ (-6.4444)^2\ +\ (7.5556)^2\ +\ (5.5556)^2\ +\ (-7.4444)^2\ +\ (4.5556)^2\ +\ (-6.4444)^2\ +\ (3.5556)^2 \\[1em]
&= (3.0625)\ + (7.5625)\ + (3.0625)\ + (39.0625)\ + \\
& \qquad
(118.2656)\ + (319.5156)\ + (1.2656)\ + (66.0156)\ + (1.2656)\ + (50.7656)\ + (172.2656)\ + (3.5156)\ + \\
& \qquad
(2.0864)\ +\ (0.3086)\ +\ (41.5309)\ +\ (57.0864)\ +\ (30.8642)\ +\ (55.4198)\ +\ (20.7531)\ +\ (41.5309)\ +\ (12.642) \\[1em]
&= 52.75\ + 732.875\ + 262.2222 \\[1em]
&= 1047.8472 \\[1em]
\end{align}
$$
From these calculations, the within sum of squares is SSW = 1047.8472.
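As a quick check on the arithmetic, the same steps can be sketched in R: compute each group mean, sum the squared deviations within each group, then add the group totals. The names `x`, `groupMeans`, and `groupSS` are illustrative only; the data values are those used in the solution above.

```r
## Data from the worked solution, one vector per group
x = list(
  trt1 = c(6, 7, 6, -2),
  trt2 = c(-10, -17, 2, 9, 2, 8, 14, -1),
  trt3 = c(2, 4, -3, 11, 9, -4, 8, -3, 7)
)

## Group means (the x-bar values in the formula)
groupMeans = sapply(x, mean)

## Sum of squared deviations from the group mean, one total per group
groupSS = sapply(x, function(v) sum((v - mean(v))^2))

## The within sum of squares is the sum of the group totals
SSW = sum(groupSS)

round(groupMeans, 4)
round(groupSS, 4)
round(SSW, 4)
```

The final value should agree with SSW = 1047.8472 to four decimal places.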
There are two ways of performing these calculations in R. The method you select will depend on how your data are stored.
Method 1: Wide Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
treatment1 = c(6, 7, 6, -2)
treatment2 = c(-10, -17, 2, 9, 2, 8, 14, -1)
treatment3 = c(2, 4, -3, 11, 9, -4, 8, -3, 7)
## Change to Long Format
mmt = c( treatment1, treatment2, treatment3 )
grp = c( rep("trt1",4), rep("trt2",8), rep("trt3",9) )
## Model the data
mod = aov(mmt~grp)
summary(mod)
In the R output, the sum of squares within is the number in the Sum Sq column of the Residuals row. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number returned is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
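Because selecting by position depends on the layout of the table, it can be worth confirming that the extracted number really is the residual sum of squares. The following is a minimal sketch, not the only way to do this; the names `sswTable`, `sswResid`, and `sswByName` are illustrative, and the data are repeated so that the block runs on its own.

```r
## Same data and model as above
mmt = c(6, 7, 6, -2, -10, -17, 2, 9, 2, 8, 14, -1, 2, 4, -3, 11, 9, -4, 8, -3, 7)
grp = c( rep("trt1",4), rep("trt2",8), rep("trt3",9) )
mod = aov(mmt~grp)

## Value taken from the ANOVA table by position (row 2, column 2)
modSummary = summary(mod)
sswTable = modSummary[[1]][2,2]

## The same quantity computed directly from the model residuals
sswResid = sum(residuals(mod)^2)

## The same quantity selected by row and column name from anova()
sswByName = anova(mod)["Residuals", "Sum Sq"]

## Compare the three values
all.equal(sswTable, sswResid)
all.equal(sswTable, sswByName)
```

Selecting by name is a little more robust than selecting by position, since it does not depend on how many rows come before Residuals in the table.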
Method 2: Long Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
yields = c(6, 7, 6, -2, -10, -17, 2, 9, 2, 8, 14, -1, 2, 4, -3, 11, 9, -4, 8, -3, 7)
grp = c('trt1', 'trt1', 'trt1', 'trt1', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3')
## Model the data
mod = aov(yields~grp)
summary(mod)
As discussed above, in the R output, the sum of squares within is the number in the Sum Sq column of the Residuals row. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number returned is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
Note: The difference between wide and long formats is this: In wide formatted data, each group has its own variable. In long formatted data, all of the measurements are stored in one variable, and a second variable records the group to which each measurement belongs.
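To illustrate the difference, here is a hedged sketch of one way to convert wide data to long data using the base R function stack(), as an alternative to building the vectors by hand with c() and rep(). The names `wide` and `long` are illustrative; stack() itself names the resulting columns values (the measurements) and ind (the group labels).

```r
## Wide format: one variable per group
treatment1 = c(6, 7, 6, -2)
treatment2 = c(-10, -17, 2, 9, 2, 8, 14, -1)
treatment3 = c(2, 4, -3, 11, 9, -4, 8, -3, 7)

## stack() takes a named list; the names become the group labels
wide = list(trt1 = treatment1, trt2 = treatment2, trt3 = treatment3)
long = stack(wide)

## Long format: one row per measurement, with its group in the ind column
head(long)

## Model the data using the long-format columns
mod = aov(values ~ ind, data = long)
summary(mod)
```

The model fitted this way should give the same ANOVA table as the two methods above, since the measurements and group labels are the same.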