Sample Correlation [Sample Statistics]

The Problem

Pearson’s sample correlation measures the strength and direction of the linear relationship between two numeric variables, summarizing that bivariate relationship in a single number. To illustrate the calculations, let us use the following example.

A professor would like to investigate the relationship between student scores on the midterm and on the final exam. To do this, the professor collected the exam scores from n = 8 students. Using the data below, what is the sample correlation?

Here is the data. Note that there are two measurements on each unit (student). We are interested in the relationship between those two measurements.

Table of the bivariate data generated
Student Identifier	Midterm Score	Final Score
1	48	55
2	82	55
3	5	5
4	60	60
5	100	32
6	67	85
7	88	83
8	97	89

Calculate the correlation of this sample, rounded to the ten-thousandths place (four decimal places).

Assistance

Solution

$$ \begin{align} r &= \frac{1}{n-1}\ \sum_{i=1}^{n}\ \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) \\[3em] &= \frac{1}{8-1}\ \sum_{i=1}^{8}\ \left(\frac{x_i - 68.375}{31.382149}\right)\left(\frac{y_i - 58}{28.839457}\right) \\[1em] &= \frac{1}{7}\ \left[ \left(\frac{x_1 - 68.375}{31.382149}\right)\left(\frac{y_1 - 58}{28.839457}\right) \right. \\ & \qquad + \left(\frac{x_2 - 68.375}{31.382149}\right)\left(\frac{y_2 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{x_3 - 68.375}{31.382149}\right)\left(\frac{y_3 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{x_4 - 68.375}{31.382149}\right)\left(\frac{y_4 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{x_5 - 68.375}{31.382149}\right)\left(\frac{y_5 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{x_6 - 68.375}{31.382149}\right)\left(\frac{y_6 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{x_7 - 68.375}{31.382149}\right)\left(\frac{y_7 - 58}{28.839457}\right) \\ & \qquad + \left. \left(\frac{x_8 - 68.375}{31.382149}\right)\left(\frac{y_8 - 58}{28.839457}\right) \right] \\[1em] &= \frac{1}{7}\ \left[ \left(\frac{48 - 68.375}{31.382149}\right)\left(\frac{55 - 58}{28.839457}\right) \right. \\ & \qquad + \left(\frac{82 - 68.375}{31.382149}\right)\left(\frac{55 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{5 - 68.375}{31.382149}\right)\left(\frac{5 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{60 - 68.375}{31.382149}\right)\left(\frac{60 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{100 - 68.375}{31.382149}\right)\left(\frac{32 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{67 - 68.375}{31.382149}\right)\left(\frac{85 - 58}{28.839457}\right) \\ & \qquad + \left(\frac{88 - 68.375}{31.382149}\right)\left(\frac{83 - 58}{28.839457}\right) \\ & \qquad + \left. \left(\frac{97 - 68.375}{31.382149}\right)\left(\frac{89 - 58}{28.839457}\right) \right] \\[1em] &= \frac{1}{7}\ \left[ \left(\frac{-20.375}{31.382149}\right)\left(\frac{-3}{28.839457}\right) \right. 
\\ & \qquad + \left(\frac{13.625}{31.382149}\right)\left(\frac{-3}{28.839457}\right) \\ & \qquad + \left(\frac{-63.375}{31.382149}\right)\left(\frac{-53}{28.839457}\right) \\ & \qquad + \left(\frac{-8.375}{31.382149}\right)\left(\frac{2}{28.839457}\right) \\ & \qquad + \left(\frac{31.625}{31.382149}\right)\left(\frac{-26}{28.839457}\right) \\ & \qquad + \left(\frac{-1.375}{31.382149}\right)\left(\frac{27}{28.839457}\right) \\ & \qquad + \left(\frac{19.625}{31.382149}\right)\left(\frac{25}{28.839457}\right) \\ & \qquad + \left. \left(\frac{28.625}{31.382149}\right)\left(\frac{31}{28.839457}\right) \right] \\[1em] &= \frac{1}{7}\ \Big[\ \left(-0.649254\right)\left(-0.104024\right) \Big. \\ & \qquad + \left(0.434164\right)\left(-0.104024\right) \\ & \qquad + \left(-2.01946\right)\left(-1.83776\right) \\ & \qquad + \left(-0.266871\right)\left(0.069349\right) \\ & \qquad + \left(1.007739\right)\left(-0.901543\right) \\ & \qquad + \left(-0.043815\right)\left(0.936217\right) \\ & \qquad + \left(0.625356\right)\left(0.866868\right) \\ & \qquad + \Big. \left(0.912143\right)\left(1.074916\right)\ \Big] \\[1em] &= \frac{1}{7}\ \Big[\ \left(0.067538\right) \Big. \\ & \qquad + \left(-0.045164\right) \\ & \qquad + \left(3.711283\right) \\ & \qquad + \left(-0.018507\right) \\ & \qquad + \left(-0.908519\right) \\ & \qquad + \left(-0.04102\right) \\ & \qquad + \left(0.542101\right) \\ & \qquad + \Big. \left(0.980477\right)\ \Big] \\[1em] &= \frac{1}{7}\ \Big[\ 4.2882\ \Big] \\[1em] &= 0.6126 \end{align} $$
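
As a cross-check on the arithmetic above, the standardized form is algebraically equivalent to dividing the sum of cross-products of the deviations by the square root of the product of the two sums of squared deviations. The symbols S_xy, S_xx, and S_yy are shorthand introduced here for illustration and are not notation from the solution itself. The three sums are computed from the deviations listed above.

$$ \begin{align} S_{xy} &= \sum_{i=1}^{8} (x_i - \bar{x})(y_i - \bar{y}) = 3881 \\[1em] S_{xx} &= \sum_{i=1}^{8} (x_i - \bar{x})^2 = 6893.875 \\[1em] S_{yy} &= \sum_{i=1}^{8} (y_i - \bar{y})^2 = 5822 \\[1em] r &= \frac{S_{xy}}{\sqrt{S_{xx}\ S_{yy}}} = \frac{3881}{\sqrt{(6893.875)(5822)}} \approx \frac{3881}{6335.309} \approx 0.6126 \end{align} $$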

And so, the correlation between the midterm and final examination scores in this sample is r = 0.6126. As this value is positive, it indicates that, in this sample, students who scored better on the midterm tended to score better on the final as well. Note, too, that we cannot (at this point) draw any conclusion about the relationship between these two variables in the population. This is only a measure of that relationship in this sample.

Excel Code

x-vals	y-vals		correlation
48	55	r:	=CORREL(A:A,B:B)
82	55
5	5
60	60
100	32
67	85
88	83
97	89
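
For readers working outside Excel, the same calculation can be sketched in Python. This is a minimal illustration, not the author's code: the function name sample_correlation and the variable names midterm and final are made up here, and only the standard-library statistics module is assumed. It follows the solution's approach of averaging the products of the standardized scores with an n - 1 divisor.

```python
from statistics import mean, stdev

# Exam scores for the n = 8 students in the table above
# (variable names are hypothetical, chosen for this sketch)
midterm = [48, 82, 5, 60, 100, 67, 88, 97]
final = [55, 55, 5, 60, 32, 85, 83, 89]


def sample_correlation(x, y):
    """Pearson's r: sum of the z-score products divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # stdev() is the sample standard deviation (n - 1 divisor)
    s_x, s_y = stdev(x), stdev(y)
    products = [
        ((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)
    ]
    return sum(products) / (n - 1)


r = sample_correlation(midterm, final)
print(round(r, 4))
```

Rounded to four decimal places, the printed value should agree with the r = 0.6126 obtained in the worked solution.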

© Ole J. Forsberg, Ph.D. 2025. All rights reserved.
