$$ \begin{align}
\text{SSW} &= \sum_{i=1}^g \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[3em]
&= \sum_{i=1}^{3} \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[1em]
&= \sum_{j=1}^{6}\ (x_{1,j} - 15.1667)^2 + \sum_{j=1}^{11}\ (x_{2,j} - 11.1818)^2 + \sum_{j=1}^{11}\ (x_{3,j} - 12)^2 \\[1em]
&= (x_{1,1} - 15.1667)^2\ + (x_{1,2} - 15.1667)^2\ + (x_{1,3} - 15.1667)^2\ + (x_{1,4} - 15.1667)^2\ + (x_{1,5} - 15.1667)^2\ + (x_{1,6} - 15.1667)^2\ + \\
& \qquad
(x_{2,1} - 11.1818)^2\ + (x_{2,2} - 11.1818)^2\ + (x_{2,3} - 11.1818)^2\ + (x_{2,4} - 11.1818)^2\ + (x_{2,5} - 11.1818)^2\ + (x_{2,6} - 11.1818)^2\ + (x_{2,7} - 11.1818)^2\ + (x_{2,8} - 11.1818)^2\ + (x_{2,9} - 11.1818)^2\ + (x_{2,10} - 11.1818)^2\ + (x_{2,11} - 11.1818)^2\ + \\
& \qquad
(x_{3,1} - 12)^2\ +\ (x_{3,2} - 12)^2\ +\ (x_{3,3} - 12)^2\ +\ (x_{3,4} - 12)^2\ +\ (x_{3,5} - 12)^2\ +\ (x_{3,6} - 12)^2\ +\ (x_{3,7} - 12)^2\ +\ (x_{3,8} - 12)^2\ +\ (x_{3,9} - 12)^2\ +\ (x_{3,10} - 12)^2\ +\ (x_{3,11} - 12)^2 \\[1em]
&= (15 - 15.1667)^2\ + (23 - 15.1667)^2\ + (14 - 15.1667)^2\ + (15 - 15.1667)^2\ + (13 - 15.1667)^2\ + (11 - 15.1667)^2\ + \\
& \qquad
(10 - 11.1818)^2\ + (19 - 11.1818)^2\ + (3 - 11.1818)^2\ + (19 - 11.1818)^2\ + (9 - 11.1818)^2\ + (-1 - 11.1818)^2\ + (13 - 11.1818)^2\ + (16 - 11.1818)^2\ + (8 - 11.1818)^2\ + (6 - 11.1818)^2\ + (21 - 11.1818)^2\ + \\
& \qquad
(15 - 12)^2\ +\ (15 - 12)^2\ +\ (4 - 12)^2\ +\ (11 - 12)^2\ +\ (-2 - 12)^2\ +\ (15 - 12)^2\ +\ (23 - 12)^2\ +\ (17 - 12)^2\ +\ (9 - 12)^2\ +\ (13 - 12)^2\ +\ (12 - 12)^2 \\[1em]
&= (-0.1667)^2\ + (7.8333)^2\ + (-1.1667)^2\ + (-0.1667)^2\ + (-2.1667)^2\ + (-4.1667)^2\ + \\
& \qquad
(-1.1818)^2\ + (7.8182)^2\ + (-8.1818)^2\ + (7.8182)^2\ + (-2.1818)^2\ + (-12.1818)^2\ + (1.8182)^2\ + (4.8182)^2\ + (-3.1818)^2\ + (-5.1818)^2\ + (9.8182)^2\ + \\
& \qquad
(3)^2\ +\ (3)^2\ +\ (-8)^2\ +\ (-1)^2\ +\ (-14)^2\ +\ (3)^2\ +\ (11)^2\ +\ (5)^2\ +\ (-3)^2\ +\ (1)^2\ +\ (0)^2 \\[1em]
&= (0.0278)\ + (61.3611)\ + (1.3611)\ + (0.0278)\ + (4.6944)\ + (17.3611)\ + \\
& \qquad
(1.3967)\ + (61.124)\ + (66.9421)\ + (61.124)\ + (4.7603)\ + (148.3967)\ + (3.3058)\ + (23.2149)\ + (10.124)\ + (26.8512)\ + (96.3967)\ + \\
& \qquad
(9)\ +\ (9)\ +\ (64)\ +\ (1)\ +\ (196)\ +\ (9)\ +\ (121)\ +\ (25)\ +\ (9)\ +\ (1)\ +\ (0) \\[1em]
&= 84.8333\ + 503.6364\ + 444 \\[1em]
&= 1032.4697 \\[1em]
\end{align}
$$
From these calculations, the within sum of squares is SSW = 1032.4697.
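The group-by-group arithmetic above can be double-checked directly, without fitting a model. The following is a minimal sketch that assumes the three treatments are stored in a named list; the names groups, ss_within_group, and ssw are illustrative and not part of the original solution.

```r
# Data from the worked solution above, one vector per treatment group
groups <- list(
  trt1 = c(15, 23, 14, 15, 13, 11),
  trt2 = c(10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21),
  trt3 = c(15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
)

# For each group: sum of squared deviations from that group's own mean
ss_within_group <- sapply(groups, function(x) sum((x - mean(x))^2))

# Per-group contributions (compare with 84.8333, 503.6364, and 444)
round(ss_within_group, 4)

# SSW is the total over all groups (compare with 1032.4697)
ssw <- sum(ss_within_group)
round(ssw, 4)
```

Because R carries the group means at full precision rather than rounding them to four decimal places, the result may differ from a hand calculation in the last digit.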
There are two ways of performing these calculations with R's aov function. The method you select will depend on how your data are stored.
Method 1: Wide Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
treatment1 = c(15, 23, 14, 15, 13, 11)
treatment2 = c(10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21)
treatment3 = c(15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
## Change to Long Format
mmt = c( treatment1, treatment2, treatment3 )
grp = c( rep("trt1",6), rep("trt2",11), rep("trt3",11) )
## Model the data
mod = aov(mmt~grp)
summary(mod)
In the R output, the value of the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the row 2, column 2 value, which is the Residuals row and the Sum Sq column.
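Since the Residuals row holds the sum of the squared residuals of the model, the same value can be cross-checked without indexing into the table. The sketch below repeats the Method 1 setup so that it runs on its own; the variable names ssw_table, ssw_resid, and ssw_dev are illustrative.

```r
# Same data and model as Method 1
treatment1 <- c(15, 23, 14, 15, 13, 11)
treatment2 <- c(10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21)
treatment3 <- c(15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
mmt <- c(treatment1, treatment2, treatment3)
grp <- c(rep("trt1", 6), rep("trt2", 11), rep("trt3", 11))
mod <- aov(mmt ~ grp)

# Value taken from the ANOVA table (Residuals row, Sum Sq column)
ssw_table <- summary(mod)[[1]][2, 2]

# Sum of squared residuals, computed directly from the fitted model
ssw_resid <- sum(residuals(mod)^2)

# deviance() returns the residual sum of squares for an aov or lm fit
ssw_dev <- deviance(mod)

# Both comparisons should report TRUE
all.equal(ssw_table, ssw_resid)
all.equal(ssw_table, ssw_dev)
```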
Method 2: Long Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
yields = c(15, 23, 14, 15, 13, 11, 10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21, 15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
grp = c('trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3')
## Model the data
mod = aov(yields~grp)
summary(mod)
As discussed above, in the R output, the value of the sum of squares within is the number in the table under Sum Sq and to the right of Residuals. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number output is the sum of squares within. How does the code get that number? The summary table (also known as an ANOVA table) is just a table. Thus, the first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the row 2, column 2 value, which is the Residuals row and the Sum Sq column.
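Selecting by position ([2,2]) works here because there is a single grouping variable, so Residuals is always the second row. A sketch of selecting by name instead is given below. It assumes the row labels in the summary table may be padded with trailing spaces, so they are trimmed before comparison; the names tab, is_resid, and ssw are illustrative.

```r
# Same data as Method 2, with the group labels built by rep()
yields <- c(15, 23, 14, 15, 13, 11,
            10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21,
            15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
grp <- rep(c("trt1", "trt2", "trt3"), times = c(6, 11, 11))
mod <- aov(yields ~ grp)

# The ANOVA table is the first element of the summary
tab <- summary(mod)[[1]]

# Find the Residuals row by its label (trimmed in case of padding)
is_resid <- trimws(rownames(tab)) == "Residuals"

# Pick the Sum Sq column by name rather than by position
ssw <- tab[is_resid, "Sum Sq"]
ssw
```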
Note: The difference between wide and long formats is this: In wide formatted data, each group has its own variable. In long formatted data, all of the measurements are in one variable, and the group each measurement belongs to is a second variable.
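The conversion from wide to long can also be done without typing out the group labels. The following sketch assumes the wide data are held in a named list (one element per group) and uses base R's stack function; the names wide and long, and the column names mmt and grp, are chosen here to match Method 1.

```r
# Wide format: each group has its own variable (here, its own list element)
wide <- list(
  trt1 = c(15, 23, 14, 15, 13, 11),
  trt2 = c(10, 19, 3, 19, 9, -1, 13, 16, 8, 6, 21),
  trt3 = c(15, 15, 4, 11, -2, 15, 23, 17, 9, 13, 12)
)

# Long format: one column of measurements and one column of group labels
long <- stack(wide)
names(long) <- c("mmt", "grp")  # stack() names its columns "values" and "ind"

# One row per measurement
nrow(long)
head(long)

# The long data frame can be passed to aov through the data argument
mod <- aov(mmt ~ grp, data = long)
summary(mod)
```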