$$ \begin{align}
\text{SSW} &= \sum_{i=1}^g \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[3em]
&= \sum_{i=1}^{5} \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[1em]
&= \sum_{j=1}^{9}\ (x_{1,j} - 9.6667)^2 + \sum_{j=1}^{10}\ (x_{2,j} - 5.6)^2 + \sum_{j=1}^{6}\ (x_{3,j} - 15.3333)^2 + \sum_{j=1}^{7}\ (x_{4,j} - 11.2857)^2 + \sum_{j=1}^{9}\ (x_{5,j} - 6.2222)^2 \\[1em]
&= (x_{1,1} - 9.6667)^2\ + (x_{1,2} - 9.6667)^2\ + (x_{1,3} - 9.6667)^2\ + (x_{1,4} - 9.6667)^2\ + (x_{1,5} - 9.6667)^2\ + (x_{1,6} - 9.6667)^2\ + (x_{1,7} - 9.6667)^2\ + (x_{1,8} - 9.6667)^2\ + (x_{1,9} - 9.6667)^2\ + \\
& \qquad
(x_{2,1} - 5.6)^2\ + (x_{2,2} - 5.6)^2\ + (x_{2,3} - 5.6)^2\ + (x_{2,4} - 5.6)^2\ + (x_{2,5} - 5.6)^2\ + (x_{2,6} - 5.6)^2\ + (x_{2,7} - 5.6)^2\ + (x_{2,8} - 5.6)^2\ + (x_{2,9} - 5.6)^2\ + (x_{2,10} - 5.6)^2\ + \\
& \qquad
(x_{3,1} - 15.3333)^2\ + (x_{3,2} - 15.3333)^2\ + (x_{3,3} - 15.3333)^2\ + (x_{3,4} - 15.3333)^2\ + (x_{3,5} - 15.3333)^2\ + (x_{3,6} - 15.3333)^2\ + \\
& \qquad
(x_{4,1} - 11.2857)^2\ + (x_{4,2} - 11.2857)^2\ + (x_{4,3} - 11.2857)^2\ + (x_{4,4} - 11.2857)^2\ + (x_{4,5} - 11.2857)^2\ + (x_{4,6} - 11.2857)^2\ + (x_{4,7} - 11.2857)^2\ + \\
& \qquad
(x_{5,1} - 6.2222)^2\ +\ (x_{5,2} - 6.2222)^2\ +\ (x_{5,3} - 6.2222)^2\ +\ (x_{5,4} - 6.2222)^2\ +\ (x_{5,5} - 6.2222)^2\ +\ (x_{5,6} - 6.2222)^2\ +\ (x_{5,7} - 6.2222)^2\ +\ (x_{5,8} - 6.2222)^2\ +\ (x_{5,9} - 6.2222)^2 \\[1em]
&= (6 - 9.6667)^2\ + (-5 - 9.6667)^2\ + (20 - 9.6667)^2\ + (1 - 9.6667)^2\ + (15 - 9.6667)^2\ + (13 - 9.6667)^2\ + (10 - 9.6667)^2\ + (12 - 9.6667)^2\ + (15 - 9.6667)^2\ + \\
& \qquad
(10 - 5.6)^2\ + (2 - 5.6)^2\ + (-3 - 5.6)^2\ + (12 - 5.6)^2\ + (12 - 5.6)^2\ + (-1 - 5.6)^2\ + (17 - 5.6)^2\ + (-8 - 5.6)^2\ + (10 - 5.6)^2\ + (5 - 5.6)^2\ + \\
& \qquad
(23 - 15.3333)^2\ + (8 - 15.3333)^2\ + (8 - 15.3333)^2\ + (23 - 15.3333)^2\ + (11 - 15.3333)^2\ + (19 - 15.3333)^2\ + \\
& \qquad
(9 - 11.2857)^2\ + (21 - 11.2857)^2\ + (18 - 11.2857)^2\ + (11 - 11.2857)^2\ + (17 - 11.2857)^2\ + (3 - 11.2857)^2\ + (0 - 11.2857)^2\ + \\
& \qquad
(7 - 6.2222)^2\ +\ (3 - 6.2222)^2\ +\ (-5 - 6.2222)^2\ +\ (7 - 6.2222)^2\ +\ (15 - 6.2222)^2\ +\ (1 - 6.2222)^2\ +\ (11 - 6.2222)^2\ +\ (16 - 6.2222)^2\ +\ (1 - 6.2222)^2 \\[1em]
&= (-3.6667)^2\ + (-14.6667)^2\ + (10.3333)^2\ + (-8.6667)^2\ + (5.3333)^2\ + (3.3333)^2\ + (0.3333)^2\ + (2.3333)^2\ + (5.3333)^2\ + \\
& \qquad
(4.4)^2\ + (-3.6)^2\ + (-8.6)^2\ + (6.4)^2\ + (6.4)^2\ + (-6.6)^2\ + (11.4)^2\ + (-13.6)^2\ + (4.4)^2\ + (-0.6)^2\ + \\
& \qquad
(7.6667)^2\ + (-7.3333)^2\ + (-7.3333)^2\ + (7.6667)^2\ + (-4.3333)^2\ + (3.6667)^2\ + \\
& \qquad
(-2.2857)^2\ + (9.7143)^2\ + (6.7143)^2\ + (-0.2857)^2\ + (5.7143)^2\ + (-8.2857)^2\ + (-11.2857)^2\ + \\
& \qquad
(0.7778)^2\ +\ (-3.2222)^2\ +\ (-11.2222)^2\ +\ (0.7778)^2\ +\ (8.7778)^2\ +\ (-5.2222)^2\ +\ (4.7778)^2\ +\ (9.7778)^2\ +\ (-5.2222)^2 \\[1em]
&= (13.4444)\ + (215.1111)\ + (106.7778)\ + (75.1111)\ + (28.4444)\ + (11.1111)\ + (0.1111)\ + (5.4444)\ + (28.4444)\ + \\
& \qquad
(19.36)\ + (12.96)\ + (73.96)\ + (40.96)\ + (40.96)\ + (43.56)\ + (129.96)\ + (184.96)\ + (19.36)\ + (0.36)\ + \\
& \qquad
(58.7778)\ + (53.7778)\ + (53.7778)\ + (58.7778)\ + (18.7778)\ + (13.4444)\ + \\
& \qquad
(5.2245)\ + (94.3673)\ + (45.0816)\ + (0.0816)\ + (32.6531)\ + (68.6531)\ + (127.3673)\ + \\
& \qquad
(0.6049)\ +\ (10.3827)\ +\ (125.9383)\ +\ (0.6049)\ +\ (77.0494)\ +\ (27.2716)\ +\ (22.8272)\ +\ (95.6049)\ +\ (27.2716) \\[1em]
&= 484\ + 566.4\ + 257.3333\ + 373.4286\ + 387.5556 \\[1em]
&= 2068.7175 \\[1em]
\end{align}
$$
From these calculations, the within sum of squares is SSW = 2068.7175.
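The hand calculation can be checked directly from the definition of SSW. The following is a minimal sketch, not part of the original solution; the names groups, ssPerGroup, and SSW are chosen here only for illustration.

```r
# Group data, copied from the worked solution above
groups <- list(
  trt1 = c(6, -5, 20, 1, 15, 13, 10, 12, 15),
  trt2 = c(10, 2, -3, 12, 12, -1, 17, -8, 10, 5),
  trt3 = c(23, 8, 8, 23, 11, 19),
  trt4 = c(9, 21, 18, 11, 17, 3, 0),
  trt5 = c(7, 3, -5, 7, 15, 1, 11, 16, 1)
)

# For each group: squared deviations from that group's own mean, summed
# (illustrative name, not from the original)
ssPerGroup <- sapply(groups, function(x) sum((x - mean(x))^2))
round(ssPerGroup, 4)

# SSW is the total of the per-group sums
SSW <- sum(ssPerGroup)
round(SSW, 4)
```

The rounded total should agree with the value of SSW stated above. Small differences in the last decimal place of intermediate values can occur because the hand calculation rounds each group mean to four decimal places.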
There are two ways of performing these calculations in R. The method you select will depend on how your data are stored.
Method 1: Wide Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
treatment1 = c(6, -5, 20, 1, 15, 13, 10, 12, 15)
treatment2 = c(10, 2, -3, 12, 12, -1, 17, -8, 10, 5)
treatment3 = c(23, 8, 8, 23, 11, 19)
treatment4 = c(9, 21, 18, 11, 17, 3, 0)
treatment5 = c(7, 3, -5, 7, 15, 1, 11, 16, 1)
## Change to Long Format
mmt = c( treatment1, treatment2, treatment3, treatment4, treatment5 )
grp = c( rep("trt1",9), rep("trt2",10), rep("trt3",6), rep("trt4",7), rep("trt5",9) )
## Model the data
mod = aov(mmt~grp)
summary(mod)
In the R output, the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the row 2, column 2 value, which is the Residuals row of the Sum Sq column.
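Because the lookup above depends on row and column positions, it can be worth cross-checking against the model's residuals. The sketch below assumes the Method 1 data; the names ssFromResiduals, ssFromDeviance, and ssFromTable are illustrative only.

```r
# Method 1 data, repeated so this block runs on its own
treatment1 <- c(6, -5, 20, 1, 15, 13, 10, 12, 15)
treatment2 <- c(10, 2, -3, 12, 12, -1, 17, -8, 10, 5)
treatment3 <- c(23, 8, 8, 23, 11, 19)
treatment4 <- c(9, 21, 18, 11, 17, 3, 0)
treatment5 <- c(7, 3, -5, 7, 15, 1, 11, 16, 1)
mmt <- c(treatment1, treatment2, treatment3, treatment4, treatment5)
grp <- c(rep("trt1", 9), rep("trt2", 10), rep("trt3", 6),
         rep("trt4", 7), rep("trt5", 9))
mod <- aov(mmt ~ grp)

# Each residual is an observation minus its group mean,
# so the sum of squared residuals is the sum of squares within
ssFromResiduals <- sum(residuals(mod)^2)

# deviance() reports the residual sum of squares of a fitted linear model
ssFromDeviance <- deviance(mod)

# Positional lookup in the ANOVA table, as described above
ssFromTable <- summary(mod)[[1]][2, 2]

c(ssFromResiduals, ssFromDeviance, ssFromTable)
```

All three values should match. If they do not, the positional lookup is probably pointing at the wrong row or column, which can happen when the model has more than one term.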
Method 2: Long Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
yields = c(6, -5, 20, 1, 15, 13, 10, 12, 15, 10, 2, -3, 12, 12, -1, 17, -8, 10, 5, 23, 8, 8, 23, 11, 19, 9, 21, 18, 11, 17, 3, 0, 7, 3, -5, 7, 15, 1, 11, 16, 1)
grp = c('trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt5', 'trt5', 'trt5', 'trt5', 'trt5', 'trt5', 'trt5', 'trt5', 'trt5')
## Model the data
mod = aov(yields~grp)
summary(mod)
As discussed above, in the R output, the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the row 2, column 2 value, which is the Residuals row of the Sum Sq column.
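Note: The difference between wide and long formats is this: In wide-format data, each group has its own variable (here, treatment1 through treatment5). In long-format data, all of the measurements are in one variable, and a second variable records the group to which each measurement belongs.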
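As an illustration of the two formats, base R's stack() function can turn a named list of group vectors (wide) into a two-column data frame (long). This is a sketch of one possible approach, not the method used above; the names wide and long are illustrative, and the column names values and ind are the ones stack() produces.

```r
# Wide format: one named vector per group
wide <- list(
  trt1 = c(6, -5, 20, 1, 15, 13, 10, 12, 15),
  trt2 = c(10, 2, -3, 12, 12, -1, 17, -8, 10, 5),
  trt3 = c(23, 8, 8, 23, 11, 19),
  trt4 = c(9, 21, 18, 11, 17, 3, 0),
  trt5 = c(7, 3, -5, 7, 15, 1, 11, 16, 1)
)

# Long format: 'values' holds the measurements, 'ind' holds the group label
long <- stack(wide)
head(long)

# The long data frame can be passed straight to aov()
mod <- aov(values ~ ind, data = long)
summary(mod)
```

The Residuals row of this table should show the same sum of squares as the two methods above, since only the storage format has changed.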