$$ \begin{align}
\text{SSW} &= \sum_{i=1}^g \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[3em]
&= \sum_{i=1}^{4} \sum_{j=1}^{n_i}\ (x_{i,j} - \bar{x}_i)^2 \\[1em]
&= \sum_{j=1}^{11}\ (x_{1,j} - 8.0909)^2 + \sum_{j=1}^{10}\ (x_{2,j} - 8.5)^2 + \sum_{j=1}^{14}\ (x_{3,j} - 7.9286)^2 + \sum_{j=1}^{11}\ (x_{4,j} - 7.2727)^2 \\[1em]
&= (x_{1,1} - 8.0909)^2\ + (x_{1,2} - 8.0909)^2\ + (x_{1,3} - 8.0909)^2\ + (x_{1,4} - 8.0909)^2\ + (x_{1,5} - 8.0909)^2\ + (x_{1,6} - 8.0909)^2\ + (x_{1,7} - 8.0909)^2\ + (x_{1,8} - 8.0909)^2\ + (x_{1,9} - 8.0909)^2\ + (x_{1,10} - 8.0909)^2\ + (x_{1,11} - 8.0909)^2\ + \\
& \qquad
(x_{2,1} - 8.5)^2\ + (x_{2,2} - 8.5)^2\ + (x_{2,3} - 8.5)^2\ + (x_{2,4} - 8.5)^2\ + (x_{2,5} - 8.5)^2\ + (x_{2,6} - 8.5)^2\ + (x_{2,7} - 8.5)^2\ + (x_{2,8} - 8.5)^2\ + (x_{2,9} - 8.5)^2\ + (x_{2,10} - 8.5)^2\ + \\
& \qquad
(x_{3,1} - 7.9286)^2\ + (x_{3,2} - 7.9286)^2\ + (x_{3,3} - 7.9286)^2\ + (x_{3,4} - 7.9286)^2\ + (x_{3,5} - 7.9286)^2\ + (x_{3,6} - 7.9286)^2\ + (x_{3,7} - 7.9286)^2\ + (x_{3,8} - 7.9286)^2\ + (x_{3,9} - 7.9286)^2\ + (x_{3,10} - 7.9286)^2\ + (x_{3,11} - 7.9286)^2\ + (x_{3,12} - 7.9286)^2\ + (x_{3,13} - 7.9286)^2\ + (x_{3,14} - 7.9286)^2\ + \\
& \qquad
(x_{4,1} - 7.2727)^2\ +\ (x_{4,2} - 7.2727)^2\ +\ (x_{4,3} - 7.2727)^2\ +\ (x_{4,4} - 7.2727)^2\ +\ (x_{4,5} - 7.2727)^2\ +\ (x_{4,6} - 7.2727)^2\ +\ (x_{4,7} - 7.2727)^2\ +\ (x_{4,8} - 7.2727)^2\ +\ (x_{4,9} - 7.2727)^2\ +\ (x_{4,10} - 7.2727)^2\ +\ (x_{4,11} - 7.2727)^2 \\[1em]
&= (8 - 8.0909)^2\ + (9 - 8.0909)^2\ + (8 - 8.0909)^2\ + (6 - 8.0909)^2\ + (8 - 8.0909)^2\ + (7 - 8.0909)^2\ + (9 - 8.0909)^2\ + (9 - 8.0909)^2\ + (10 - 8.0909)^2\ + (10 - 8.0909)^2\ + (5 - 8.0909)^2\ + \\
& \qquad
(9 - 8.5)^2\ + (8 - 8.5)^2\ + (8 - 8.5)^2\ + (6 - 8.5)^2\ + (8 - 8.5)^2\ + (11 - 8.5)^2\ + (10 - 8.5)^2\ + (6 - 8.5)^2\ + (9 - 8.5)^2\ + (10 - 8.5)^2\ + \\
& \qquad
(8 - 7.9286)^2\ + (4 - 7.9286)^2\ + (10 - 7.9286)^2\ + (11 - 7.9286)^2\ + (8 - 7.9286)^2\ + (8 - 7.9286)^2\ + (5 - 7.9286)^2\ + (10 - 7.9286)^2\ + (5 - 7.9286)^2\ + (10 - 7.9286)^2\ + (10 - 7.9286)^2\ + (10 - 7.9286)^2\ + (7 - 7.9286)^2\ + (5 - 7.9286)^2\ + \\
& \qquad
(8 - 7.2727)^2\ +\ (10 - 7.2727)^2\ +\ (7 - 7.2727)^2\ +\ (5 - 7.2727)^2\ +\ (6 - 7.2727)^2\ +\ (7 - 7.2727)^2\ +\ (7 - 7.2727)^2\ +\ (7 - 7.2727)^2\ +\ (8 - 7.2727)^2\ +\ (6 - 7.2727)^2\ +\ (9 - 7.2727)^2 \\[1em]
&= (-0.0909)^2\ + (0.9091)^2\ + (-0.0909)^2\ + (-2.0909)^2\ + (-0.0909)^2\ + (-1.0909)^2\ + (0.9091)^2\ + (0.9091)^2\ + (1.9091)^2\ + (1.9091)^2\ + (-3.0909)^2\ + \\
& \qquad
(0.5)^2\ + (-0.5)^2\ + (-0.5)^2\ + (-2.5)^2\ + (-0.5)^2\ + (2.5)^2\ + (1.5)^2\ + (-2.5)^2\ + (0.5)^2\ + (1.5)^2\ + \\
& \qquad
(0.0714)^2\ + (-3.9286)^2\ + (2.0714)^2\ + (3.0714)^2\ + (0.0714)^2\ + (0.0714)^2\ + (-2.9286)^2\ + (2.0714)^2\ + (-2.9286)^2\ + (2.0714)^2\ + (2.0714)^2\ + (2.0714)^2\ + (-0.9286)^2\ + (-2.9286)^2\ + \\
& \qquad
(0.7273)^2\ +\ (2.7273)^2\ +\ (-0.2727)^2\ +\ (-2.2727)^2\ +\ (-1.2727)^2\ +\ (-0.2727)^2\ +\ (-0.2727)^2\ +\ (-0.2727)^2\ +\ (0.7273)^2\ +\ (-1.2727)^2\ +\ (1.7273)^2 \\[1em]
&= (0.0083)\ + (0.8264)\ + (0.0083)\ + (4.3719)\ + (0.0083)\ + (1.1901)\ + (0.8264)\ + (0.8264)\ + (3.6446)\ + (3.6446)\ + (9.5537)\ + \\
& \qquad
(0.25)\ + (0.25)\ + (0.25)\ + (6.25)\ + (0.25)\ + (6.25)\ + (2.25)\ + (6.25)\ + (0.25)\ + (2.25)\ + \\
& \qquad
(0.0051)\ + (15.4337)\ + (4.2908)\ + (9.4337)\ + (0.0051)\ + (0.0051)\ + (8.5765)\ + (4.2908)\ + (8.5765)\ + (4.2908)\ + (4.2908)\ + (4.2908)\ + (0.8622)\ + (8.5765)\ + \\
& \qquad
(0.5289)\ +\ (7.438)\ +\ (0.0744)\ +\ (5.1653)\ +\ (1.6198)\ +\ (0.0744)\ +\ (0.0744)\ +\ (0.0744)\ +\ (0.5289)\ +\ (1.6198)\ +\ (2.9835) \\[1em]
&= 24.9091\ + 24.5\ + 72.9286\ + 20.1818 \\[1em]
&= 142.5195
\end{align}
$$
From these calculations, the within sum of squares is SSW = 142.5195.
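The hand calculation above can be checked with a short R sketch. The names `groups`, `ssByGroup`, and `ssw` are illustrative choices, not part of the original exercise; the data are the four treatment groups used in the derivation.

```r
## Group data from the worked example (one vector per treatment)
groups = list(
  trt1 = c(8, 9, 8, 6, 8, 7, 9, 9, 10, 10, 5),
  trt2 = c(9, 8, 8, 6, 8, 11, 10, 6, 9, 10),
  trt3 = c(8, 4, 10, 11, 8, 8, 5, 10, 5, 10, 10, 10, 7, 5),
  trt4 = c(8, 10, 7, 5, 6, 7, 7, 7, 8, 6, 9)
)

## For each group: sum of squared deviations from that group's own mean
ssByGroup = sapply(groups, function(x) sum((x - mean(x))^2))
round(ssByGroup, 4)

## SSW is the total of the per-group sums
ssw = sum(ssByGroup)
round(ssw, 4)
```

The four values in `ssByGroup` correspond to the four subtotals in the second-to-last line of the derivation, and `ssw` should agree with 142.5195 after rounding.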
There are two ways of performing these calculations in R. The method you select will depend on how your data are stored.
Method 1: Wide Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
treatment1 = c(8, 9, 8, 6, 8, 7, 9, 9, 10, 10, 5)
treatment2 = c(9, 8, 8, 6, 8, 11, 10, 6, 9, 10)
treatment3 = c(8, 4, 10, 11, 8, 8, 5, 10, 5, 10, 10, 10, 7, 5)
treatment4 = c(8, 10, 7, 5, 6, 7, 7, 7, 8, 6, 9)
## Change to Long Format
mmt = c( treatment1, treatment2, treatment3, treatment4 )
grp = c( rep("trt1",11), rep("trt2",10), rep("trt3",14), rep("trt4",11) )
## Model the data
mod = aov(mmt~grp)
summary(mod)
In the R output, the sum of squares within is the number in the Sum Sq column of the Residuals row. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number returned is the sum of squares within. How does this work? The summary table (also known as an ANOVA table) is just a table. The first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
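Selecting by position depends on the Residuals row being row 2. As a cross-check, the following sketch computes the same quantity directly from the fitted model's residuals. The variable names `byPosition`, `byResiduals`, and `byDeviance` are illustrative; `residuals()` and `deviance()` are standard R functions, and for a model fitted with `aov` the deviance is the residual sum of squares.

```r
## Data from the example above
treatment1 = c(8, 9, 8, 6, 8, 7, 9, 9, 10, 10, 5)
treatment2 = c(9, 8, 8, 6, 8, 11, 10, 6, 9, 10)
treatment3 = c(8, 4, 10, 11, 8, 8, 5, 10, 5, 10, 10, 10, 7, 5)
treatment4 = c(8, 10, 7, 5, 6, 7, 7, 7, 8, 6, 9)
mmt = c(treatment1, treatment2, treatment3, treatment4)
grp = c(rep("trt1", 11), rep("trt2", 10), rep("trt3", 14), rep("trt4", 11))

mod = aov(mmt ~ grp)

## Value taken by position from the ANOVA table (row 2, column 2)
byPosition = summary(mod)[[1]][2, 2]

## Same quantity: sum of the squared residuals of the fitted model
byResiduals = sum(residuals(mod)^2)

## Same quantity again: deviance() of the fitted model
byDeviance = deviance(mod)

## Both comparisons should report TRUE
all.equal(byPosition, byResiduals)
all.equal(byPosition, byDeviance)
```

Each residual is an observation minus its own group mean, so summing the squared residuals is the same calculation as the definition of SSW.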
Method 2: Long Format
Copy and paste the following code into your R script window, then run it from there.
## Import data
yields = c(8, 9, 8, 6, 8, 7, 9, 9, 10, 10, 5, 9, 8, 8, 6, 8, 11, 10, 6, 9, 10, 8, 4, 10, 11, 8, 8, 5, 10, 5, 10, 10, 10, 7, 5, 8, 10, 7, 5, 6, 7, 7, 7, 8, 6, 9)
grp = c('trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt1', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt2', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt3', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4', 'trt4')
## Model the data
mod = aov(yields~grp)
summary(mod)
As discussed above, in the R output, the sum of squares within is the number in the Sum Sq column of the Residuals row. If you would like better precision for that value, or if you would like to have only that value, run the following code in addition to that above:
modSummary = summary(mod)
modSummary[[1]][2,2]
Here, the number returned is the sum of squares within. How does this work? The summary table (also known as an ANOVA table) is just a table. The first line saves the summary as the variable modSummary; the last line looks inside that variable, selects the ANOVA table ([[1]]), and then selects the value in row 2 (Residuals), column 2 (Sum Sq).
Note: The difference between wide and long formats is this: In wide-format data, each group has its own variable. In long-format data, all of the measurements are stored in one variable, and a second variable records the group each measurement belongs to.
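As an illustration of the two layouts, the sketch below converts the wide-format vectors into a long-format data frame with the base R function `stack()`. The names `wide` and `long`, and the column names `mmt` and `grp`, are choices made here for illustration.

```r
## Wide format: one vector per group
wide = list(
  trt1 = c(8, 9, 8, 6, 8, 7, 9, 9, 10, 10, 5),
  trt2 = c(9, 8, 8, 6, 8, 11, 10, 6, 9, 10),
  trt3 = c(8, 4, 10, 11, 8, 8, 5, 10, 5, 10, 10, 10, 7, 5),
  trt4 = c(8, 10, 7, 5, 6, 7, 7, 7, 8, 6, 9)
)

## Long format: stack() returns one column of values and one of group labels
long = stack(wide)
names(long) = c("mmt", "grp")

## Inspect the first rows and the number of observations per group
head(long)
table(long$grp)

## The long-format data frame can be passed to aov() directly
mod = aov(mmt ~ grp, data = long)
summary(mod)
```

The `table()` call is a quick check that the group sizes (11, 10, 14, and 11) survived the conversion.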