$$ \begin{align}
\text{SSW} &= \sum_{i=1}^g \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[3em]
&= \sum_{i=1}^{4} \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[1em]
&= \sum_{j=1}^{8}\ (x_{1,j} - 10.75)^2 + \sum_{j=1}^{9}\ (x_{2,j} - 11.8889)^2 + \sum_{j=1}^{6}\ (x_{3,j} - 7.8333)^2 + \sum_{j=1}^{10}\ (x_{4,j} - 12.2)^2 \\[1em]
&= (x_{1,1} - 10.75)^2\ + (x_{1,2} - 10.75)^2\ + (x_{1,3} - 10.75)^2\ + (x_{1,4} - 10.75)^2\ + (x_{1,5} - 10.75)^2\ + (x_{1,6} - 10.75)^2\ + (x_{1,7} - 10.75)^2\ + (x_{1,8} - 10.75)^2\ + \\
& \qquad
(x_{2,1} - 11.8889)^2\ + (x_{2,2} - 11.8889)^2\ + (x_{2,3} - 11.8889)^2\ + (x_{2,4} - 11.8889)^2\ + (x_{2,5} - 11.8889)^2\ + (x_{2,6} - 11.8889)^2\ + (x_{2,7} - 11.8889)^2\ + (x_{2,8} - 11.8889)^2\ + (x_{2,9} - 11.8889)^2\ + \\
& \qquad
(x_{3,1} - 7.8333)^2\ + (x_{3,2} - 7.8333)^2\ + (x_{3,3} - 7.8333)^2\ + (x_{3,4} - 7.8333)^2\ + (x_{3,5} - 7.8333)^2\ + (x_{3,6} - 7.8333)^2\ + \\
& \qquad
(x_{4,1} - 12.2)^2\ +\ (x_{4,2} - 12.2)^2\ +\ (x_{4,3} - 12.2)^2\ +\ (x_{4,4} - 12.2)^2\ +\ (x_{4,5} - 12.2)^2\ +\ (x_{4,6} - 12.2)^2\ +\ (x_{4,7} - 12.2)^2\ +\ (x_{4,8} - 12.2)^2\ +\ (x_{4,9} - 12.2)^2\ +\ (x_{4,10} - 12.2)^2 \\[1em]
&= (13 - 10.75)^2\ + (13 - 10.75)^2\ + (9 - 10.75)^2\ + (1 - 10.75)^2\ + (16 - 10.75)^2\ + (17 - 10.75)^2\ + (3 - 10.75)^2\ + (14 - 10.75)^2\ + \\
& \qquad
(18 - 11.8889)^2\ + (20 - 11.8889)^2\ + (6 - 11.8889)^2\ + (9 - 11.8889)^2\ + (9 - 11.8889)^2\ + (11 - 11.8889)^2\ + (15 - 11.8889)^2\ + (1 - 11.8889)^2\ + (18 - 11.8889)^2\ + \\
& \qquad
(5 - 7.8333)^2\ + (9 - 7.8333)^2\ + (4 - 7.8333)^2\ + (10 - 7.8333)^2\ + (13 - 7.8333)^2\ + (6 - 7.8333)^2\ + \\
& \qquad
(1 - 12.2)^2\ +\ (5 - 12.2)^2\ +\ (11 - 12.2)^2\ +\ (6 - 12.2)^2\ +\ (18 - 12.2)^2\ +\ (19 - 12.2)^2\ +\ (15 - 12.2)^2\ +\ (23 - 12.2)^2\ +\ (15 - 12.2)^2\ +\ (9 - 12.2)^2 \\[1em]
&= (2.25)^2\ + (2.25)^2\ + (-1.75)^2\ + (-9.75)^2\ + (5.25)^2\ + (6.25)^2\ + (-7.75)^2\ + (3.25)^2\ + \\
& \qquad
(6.1111)^2\ + (8.1111)^2\ + (-5.8889)^2\ + (-2.8889)^2\ + (-2.8889)^2\ + (-0.8889)^2\ + (3.1111)^2\ + (-10.8889)^2\ + (6.1111)^2\ + \\
& \qquad
(-2.8333)^2\ + (1.1667)^2\ + (-3.8333)^2\ + (2.1667)^2\ + (5.1667)^2\ + (-1.8333)^2\ + \\
& \qquad
(-11.2)^2\ +\ (-7.2)^2\ +\ (-1.2)^2\ +\ (-6.2)^2\ +\ (5.8)^2\ +\ (6.8)^2\ +\ (2.8)^2\ +\ (10.8)^2\ +\ (2.8)^2\ +\ (-3.2)^2 \\[1em]
&= (5.0625)\ + (5.0625)\ + (3.0625)\ + (95.0625)\ + (27.5625)\ + (39.0625)\ + (60.0625)\ + (10.5625)\ + \\
& \qquad
(37.3457)\ + (65.7901)\ + (34.679)\ + (8.3457)\ + (8.3457)\ + (0.7901)\ + (9.679)\ + (118.5679)\ + (37.3457)\ + \\
& \qquad
(8.0278)\ + (1.3611)\ + (14.6944)\ + (4.6944)\ + (26.6944)\ + (3.3611)\ + \\
& \qquad
(125.44)\ +\ (51.84)\ +\ (1.44)\ +\ (38.44)\ +\ (33.64)\ +\ (46.24)\ +\ (7.84)\ +\ (116.64)\ +\ (7.84)\ +\ (10.24) \\[1em]
&= 245.5\ + 320.8889\ + 58.8333\ + 439.6 \\[1em]
&= 1064.8222 \\[1em]
\end{align}
$$
From these calculations, the within sum of squares is SSW = 1064.8222.
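As a cross-check on the arithmetic, the definition of SSW can be evaluated directly in R. This is only a minimal sketch and not part of the worked solution; the names groups, ssw_by_group, and ssw are illustrative choices.

```r
# Observations for the four treatment groups (same data as the worked solution)
groups <- list(
  trt1 = c(13, 13, 9, 1, 16, 17, 3, 14),
  trt2 = c(18, 20, 6, 9, 9, 11, 15, 1, 18),
  trt3 = c(5, 9, 4, 10, 13, 6),
  trt4 = c(1, 5, 11, 6, 18, 19, 15, 23, 15, 9)
)

# Inner sum: squared deviations of each observation from its own group mean
ssw_by_group <- sapply(groups, function(x) sum((x - mean(x))^2))
round(ssw_by_group, 4)

# Outer sum: add the four group contributions to get SSW
ssw <- sum(ssw_by_group)
round(ssw, 4)
```

The four group contributions come out to 245.5, 320.8889, 58.8333, and 439.6, and their total is 1064.8222.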
There are two ways of performing these calculations in R. The method you select will depend on how your data are stored.
Method 1: Wide Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
treatment1 = c(13, 13, 9, 1, 16, 17, 3, 14)
treatment2 = c(18, 20, 6, 9, 9, 11, 15, 1, 18)
treatment3 = c(5, 9, 4, 10, 13, 6)
treatment4 = c(1, 5, 11, 6, 18, 19, 15, 23, 15, 9)
## Change to Long Format
mmt = c( treatment1, treatment2, treatment3, treatment4 )
grp = c( rep("trt1",8), rep("trt2",9), rep("trt3",6), rep("trt4",10) )
## Model the data
mod = aov(mmt~grp)
summary(mod)
In the R output, the value of the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does this work? The summary table (also known as an ANOVA table) is just a table. The first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
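If you would rather not depend on the position of a cell in the table, the same quantity can also be recovered from the fitted model itself. The following is a sketch under the assumption that the model is the one-way aov fit from Method 1; the names ssw_resid and ssw_dev are illustrative.

```r
# Same data and model as in Method 1
mmt <- c(13, 13, 9, 1, 16, 17, 3, 14,
         18, 20, 6, 9, 9, 11, 15, 1, 18,
         5, 9, 4, 10, 13, 6,
         1, 5, 11, 6, 18, 19, 15, 23, 15, 9)
grp <- rep(c("trt1", "trt2", "trt3", "trt4"), times = c(8, 9, 6, 10))
mod <- aov(mmt ~ grp)

# Each residual is an observation minus its group mean,
# so the sum of squared residuals is the sum of squares within
ssw_resid <- sum(residuals(mod)^2)

# deviance() reports the residual sum of squares of a fitted linear model
ssw_dev <- deviance(mod)

c(ssw_resid, ssw_dev)
```

Both values should agree with the row 2, column 2 entry of the ANOVA table.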
Method 2: Long Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
yields = c(13, 13, 9, 1, 16, 17, 3, 14, 18, 20, 6, 9, 9, 11, 15, 1, 18, 5, 9, 4, 10, 13, 6, 1, 5, 11, 6, 18, 19, 15, 23, 15, 9)
grp = c('trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4')
## Model the data
mod = aov(yields~grp)
summary(mod)
As discussed above, in the R output, the value of the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does this work? The summary table (also known as an ANOVA table) is just a table. The first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
Note: The difference between wide and long formats is this: In wide-format data, each group has its own variable. In long-format data, all of the measurements are stored in a single variable, and a second variable records the group to which each measurement belongs.
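To make the distinction concrete, here is a small sketch that converts the wide-format vectors into long format with the base R function stack(). The names wide and long are illustrative; the column names values and ind are the ones stack() produces.

```r
# Wide format: one vector per treatment group
wide <- list(
  trt1 = c(13, 13, 9, 1, 16, 17, 3, 14),
  trt2 = c(18, 20, 6, 9, 9, 11, 15, 1, 18),
  trt3 = c(5, 9, 4, 10, 13, 6),
  trt4 = c(1, 5, 11, 6, 18, 19, 15, 23, 15, 9)
)

# Long format: stack() puts every measurement in the `values` column
# and records the group it came from in the `ind` column
long <- stack(wide)
head(long)

# Number of measurements per group
table(long$ind)

# The long-format data frame can be passed straight to aov()
mod <- aov(values ~ ind, data = long)
summary(mod)
```

This does the same job as the manual rep() calls in Method 1, but reads the group sizes from the data instead of requiring them to be typed in.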