The Problem
The sample covariance is an unstandardized measure of the relationship between two numeric variables. It is used to summarize the bivariate relationship. To illustrate the calculations, let us use the following example.
A professor would like to investigate the relationship between student scores on the midterm and on the final exam. To do this, the professor collected the exam scores from n = 5 students. Using the data below, what is the sample covariance?
Here is the data. Note that there are two measurements on each unit (student). We are interested in the relationship between those two measurements.
Table of the bivariate data generated
Student Identifier  Midterm Score  Final Score
1                   86             53
2                   39             21
3                   85             72
4                   28             100
5                   73             72
Calculate the covariance of this sample.
Answer
The correct answer is -56.65. The solution below shows how to obtain it.
Assistance
Solution
$$ \begin{align}
\text{cov}(x,y) &= \frac{1}{n-1}\ \sum_{i=1}^{n}\ \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) \\[3em]
&= \frac{1}{5-1}\ \sum_{i=1}^{5}\ \left(x_i - 62.2\right)\left(y_i - 63.6\right) \\[1em]
&= \frac{1}{4}\ \Big[\ \left(x_1 - 62.2\right)\left(y_1 - 63.6\right) \Big. \\
& \qquad + \left(x_2 - 62.2\right)\left(y_2 - 63.6\right) \\
& \qquad + \left(x_3 - 62.2\right)\left(y_3 - 63.6\right) \\
& \qquad + \left(x_4 - 62.2\right)\left(y_4 - 63.6\right) \\
& \qquad + \Big. \left(x_5 - 62.2\right)\left(y_5 - 63.6\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ \left(86 - 62.2\right)\left(53 - 63.6\right)\ \Big. \\
& \qquad + \left(39 - 62.2\right)\left(21 - 63.6\right) \\
& \qquad + \left(85 - 62.2\right)\left(72 - 63.6\right) \\
& \qquad + \left(28 - 62.2\right)\left(100 - 63.6\right) \\
& \qquad + \Big. \left(73 - 62.2\right)\left(72 - 63.6\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[ \left(23.8\right)\left(-10.6\right) \Big. \\
& \qquad + \left(-23.2\right)\left(-42.6\right) \\
& \qquad + \left(22.8\right)\left(8.4\right) \\
& \qquad + \left(-34.2\right)\left(36.4\right) \\
& \qquad + \Big. \left(10.8\right)\left(8.4\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ \left(-252.28\right) \Big. \\
& \qquad + \left(988.32\right) \\
& \qquad + \left(191.52\right) \\
& \qquad + \left(-1244.88\right) \\
& \qquad + \Big. \left(90.72\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ -226.6\ \Big] \\[1em]
&= -56.65
\end{align}
$$
And so, the covariance between the midterm and final examination scores in this sample is -56.65.
As this value is negative, it indicates that, in this sample, those who scored better on the midterm tended to score worse on the final.
However, the covariance alone does not tell us whether this relationship is strong or weak, because its magnitude depends on the units of the two variables. To draw that conclusion, we should use a standardized measure of relationship. That measure is called the correlation.
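As a rough sketch of that standardization for these data, assume the usual definition of the sample correlation as the covariance divided by the product of the two sample standard deviations. The symbols s_x and s_y are introduced here for illustration and do not appear elsewhere on this page; the values 727.7 and 848.3 are the sample variances of the midterm and final scores, computed from the same deviations used in the solution with the same n - 1 divisor.

$$ \begin{align}
r &= \frac{\text{cov}(x,y)}{s_x\ s_y} \\[1em]
&= \frac{-56.65}{\sqrt{727.7}\ \sqrt{848.3}} \\[1em]
&\approx \frac{-56.65}{\left(26.98\right)\left(29.13\right)} \\[1em]
&\approx -0.072
\end{align}
$$

Under these assumptions the standardized value is close to zero, which suggests that the negative relationship in this sample is weak.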
Finally, as with the correlation, we cannot (at this point) draw any conclusion about the relationship between these two variables in the population. This is only a measure of that relationship in this sample.
R Code
Copy and paste the following code into your R script window, then run it from there.
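# Midterm (x) and final (y) scores for the five students
xvals = c(86, 39, 85, 28, 73)
yvals = c(53, 21, 72, 100, 72)
# Sample covariance (uses the n - 1 divisor)
cov(xvals, yvals)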
In the R output, the sample covariance is the number output by the script.
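To see how the built-in function relates to the hand calculation in the solution, the same steps can be sketched in R. This is only an illustration: the names n, xdev, ydev, and manual_cov are made up for this example and are not part of the original script.

```r
# Midterm (x) and final (y) scores for the five students
xvals = c(86, 39, 85, 28, 73)
yvals = c(53, 21, 72, 100, 72)

n    = length(xvals)           # sample size
xdev = xvals - mean(xvals)     # deviations from the midterm mean
ydev = yvals - mean(yvals)     # deviations from the final mean

# Sum the products of the paired deviations, then divide by n - 1
manual_cov = sum(xdev * ydev) / (n - 1)
manual_cov

# Compare with the built-in function (allows for floating-point rounding)
all.equal(manual_cov, cov(xvals, yvals))
```

The element-wise product xdev * ydev corresponds to the five bracketed terms in the solution, so printing it is a quick way to check each term of a hand calculation.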
Excel Code
Copy and paste the following code into your Excel spreadsheet window, making sure your cursor is in cell A1 when you paste, so that the value xvals ends up in A1.
How to calculate the covariance in Excel.
xvals  yvals  covariance
86     53     cov:  =COVARIANCE.S(A:A,B:B)
39     21
85     72
28     100
73     72
Make sure that you begin pasting in cell A1.