The Problem
The sample covariance is an unstandardized measure of the relationship between two numeric variables. It is used to summarize the bivariate relationship. To illustrate the calculations, let us use the following example.
A professor would like to investigate the relationship between students' scores on the midterm and on the final exam. To do this, the professor collected the exam scores from n = 5 students. Using the data below, what is the sample covariance?
Here is the data. Note that there are two measurements on each unit (student). We are interested in the relationship between those two measurements.
Table of the bivariate data generated
Student Identifier  Midterm Score  Final Score
1  50  74 
2  64  99 
3  16  12 
4  3  16 
5  73  37 
Calculate the covariance of this sample.
Assistance
Solution
$$ \begin{align}
\text{cov}(x,y) &= \frac{1}{n-1}\ \sum_{i=1}^{n}\ \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right) \\[3em]
&= \frac{1}{5-1}\ \sum_{i=1}^{5}\ \left(x_i - 41.2\right)\left(y_i - 47.6\right) \\[1em]
&= \frac{1}{4}\ \Big[\ \left(x_1 - 41.2\right)\left(y_1 - 47.6\right) \Big. \\
& \qquad + \left(x_2 - 41.2\right)\left(y_2 - 47.6\right) \\
& \qquad + \left(x_3 - 41.2\right)\left(y_3 - 47.6\right) \\
& \qquad + \left(x_4 - 41.2\right)\left(y_4 - 47.6\right) \\
& \qquad + \Big. \left(x_5 - 41.2\right)\left(y_5 - 47.6\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ \left(50 - 41.2\right)\left(74 - 47.6\right)\ \Big. \\
& \qquad + \left(64 - 41.2\right)\left(99 - 47.6\right) \\
& \qquad + \left(16 - 41.2\right)\left(12 - 47.6\right) \\
& \qquad + \left(3 - 41.2\right)\left(16 - 47.6\right) \\
& \qquad + \Big. \left(73 - 41.2\right)\left(37 - 47.6\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[ \left(8.8\right)\left(26.4\right) \Big. \\
& \qquad + \left(22.8\right)\left(51.4\right) \\
& \qquad + \left(-25.2\right)\left(-35.6\right) \\
& \qquad + \left(-38.2\right)\left(-31.6\right) \\
& \qquad + \Big. \left(31.8\right)\left(-10.6\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ \left(232.32\right) \Big. \\
& \qquad + \left(1171.92\right) \\
& \qquad + \left(897.12\right) \\
& \qquad + \left(1207.12\right) \\
& \qquad + \Big. \left(-337.08\right)\ \Big] \\[1em]
&= \frac{1}{4}\ \Big[\ 3171.4\ \Big] \\[1em]
&= 792.85
\end{align}
$$
And so, the covariance between the midterm and final examination scores in this sample is 792.85.
As this value is positive, it indicates that those who scored better on the midterm also tended to score better on the final.
However, this value alone does not tell us whether the relationship is strong or weak, because the size of the covariance depends on the units and spread of the two variables. To draw that conclusion, we should use a standardized measure of the relationship. That measure is called the correlation.
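As a hedged illustration of that standardization (this step is not part of the original solution; here s_x and s_y are assumed to denote the sample standard deviations of the midterm and final scores, computed from the same five observations and rounded to two decimals), the covariance can be divided by the product of the two standard deviations:

$$ \begin{align}
r &= \frac{\text{cov}(x,y)}{s_x\, s_y} \\[1em]
&\approx \frac{792.85}{\left(30.43\right)\left(37.81\right)} \\[1em]
&\approx 0.69
\end{align}
$$

Unlike the covariance, this value has no units and always lies between -1 and 1.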
Finally, as with the correlation, we cannot (at this point) draw any conclusion about the relationship between these two variables in the population. This is only a measure of that relationship in this sample.
R Code
Copy and paste the following code into your R script window, then run it from there.
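# Midterm (x) and final (y) scores for the five students
xvals <- c(50, 64, 16, 3, 73)
yvals <- c(74, 99, 12, 16, 37)
# Sample covariance (uses the n - 1 divisor)
cov(xvals, yvals)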
The single number that R prints is the sample covariance.
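To connect the built-in function to the hand calculation in the solution, the following is a minimal sketch that reproduces each step in R. The names n, xdev, ydev, products, and manual_cov are hypothetical and chosen only for this illustration; they do not appear in the original code.

```r
# Midterm (x) and final (y) scores, as given in the problem
xvals <- c(50, 64, 16, 3, 73)
yvals <- c(74, 99, 12, 16, 37)

# Number of students in the sample
n <- length(xvals)

# Deviations of each score from its sample mean (41.2 and 47.6)
xdev <- xvals - mean(xvals)
ydev <- yvals - mean(yvals)

# Product of the paired deviations, one value per student
products <- xdev * ydev

# Sum the products and divide by n - 1
manual_cov <- sum(products) / (n - 1)
manual_cov

# Compare with the built-in function; the two should agree up to rounding
all.equal(manual_cov, cov(xvals, yvals))
```

Printing xdev, ydev, and products along the way should show the same intermediate values that appear in the bracketed lines of the solution.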
Excel Code
Copy and paste the following code into your Excel spreadsheet window, making sure the value xvals ends up in cell A1 after pasting.
How to calculate the sample covariance in Excel.
xvals  yvals    covariance
50  74  cov:  =COVARIANCE.S(A:A,B:B)
64  99
16  12
3  16
73  37
Make sure that you begin pasting in cell A1.