Estimating the Mean Square Error, MSE

In linear regression, we are modeling the dependent variable using this model:

Y = β₀ + β₁X + ε

Here, Y is the dependent variable, X is the independent variable, β₀ is the expected value of Y when X = 0 in the population, β₁ is the effect of X on Y in the population, and ε is random variation unexplained by the model.

To perform statistical inference, we make the usual assumption that

ε ~ Normal(0, σ²)

The mean square error is an estimate of that σ². It is also a measure of the variance unexplained by the model and data. There are different methods for estimating this value. Ordinary least squares is one method. Its strengths are that it is easy to perform, it is exact, and that it is straight forward to create the estimators.

The Problem

Example #275: Let us model (explain) the weight of a leaf using its thickness. To explore this, we collect data. The data consist of two measurements on each unit (leaf): thickness and weight. Thus, our data are

Data table
Leaf Number	Thickness [μm]	Weight [g]
1	4.6	10.5
2	6.2	12.1
3	5.4	13.4
4	7.4	13.2
5	6.2	10.6
6	3.9	13

With this information, we estimated the linear model to be

weight = 11.502 + (0.1124) thickness

What is the mean square error?

Information given:

To summarize the above, the values of import are:

Summary statistics from the problem
$\bar{x}$	=	5.6167
$\bar{y}$	=	12.1333
$b_0$	=	11.502
$b_1$	=	0.1124

Note that $b_0$ is important in estimating the y-intercept. If you are unsure how to calculate it, or if you would like more practice doing so, please see the OLS estimate of the y-intercept tutorial.

Note also that $b_1$ is important in estimating the mean square error. If you are unsure how to calculate it, or if you would like more practice doing so, please see the OLS estimate of the slope tutorial.

Your Answer

In the box below, please enter the mean square error of the model and data, then click on the “Check your answer!” button. Please round your answer to the ten-thousandths place.

Your guess:

Another Set of Data

Would you like to continue working on this topic? If so, click here for another data set.

Assistance

Show Formula

Hide Formula

Here is the formula you can use to calculate the OLS estimate of the effect.

$MSE = \frac{1}{n-2}\ \sum_{i=1}^n (y_i - \hat{y}_i)^2 = \frac{1}{n-2}\ \sum_{i=1}^n (y_i - b_0 - b_1 x_i)^2$

In this formula, there are a few symbols to know:

$n$

the sample size

$\hat{y}_i$

the predicted value of $y_i$ given $x_i$

$y_i$

the the i^th observed value of the dependent variable

$x_i$

the the i^th observed value of the independent variable

$b_0$

the OLS estimate of the y-intercept

$b_1$

the OLS estimate of the slope

$\sum$

summing everything in the expression to the right

Show Solution

Hide Solution

$\begin{align} MSE &= \frac{1}{n-2}\ \sum_{i=1}^n (y_i - b_0 - b_1 x_i)^2 \\[3em] &= \frac{1}{6-2}\ \sum_{i=1}^6 \Big(y_i - 11.502 - (0.1124) x_i\Big)^2 \\[1em] &= \frac{1}{4}\ \sum_{i=1}^6 \Big(y_i - 11.502 - (0.1124) x_i\Big)^2 \\[1em] &= \frac{1}{4}\ \Big[ \Big(10.5 - 11.502 - (0.1124)4.6\Big)^2 + \Big(12.1 - 11.502 - (0.1124)6.2\Big)^2 + \Big(13.4 - 11.502 - (0.1124)5.4\Big)^2 + \Big(13.2 - 11.502 - (0.1124)7.4\Big)^2 + \Big(10.6 - 11.502 - (0.1124)6.2\Big)^2 + \Big(13 - 11.502 - (0.1124)3.9\Big)^2 \Big] \\[1em] &= \frac{1}{4}\ \Big[ \Big(10.5 - 11.502 - (0.5171\Big)^2 + \Big(12.1 - 11.502 - (0.6969\Big)^2 + \Big(13.4 - 11.502 - (0.607\Big)^2 + \Big(13.2 - 11.502 - (0.8318\Big)^2 + \Big(10.6 - 11.502 - (0.6969\Big)^2 + \Big(13 - 11.502 - (0.4384\Big)^2 \Big] \\[1em] &= \frac{1}{4}\ \Big[ \Big(-1.5191\Big)^2 + \Big(-0.0989\Big)^2 + \Big(1.291\Big)^2 + \Big(0.8662\Big)^2 + \Big(-1.5989\Big)^2 + \Big(1.0596\Big)^2 \Big] \\[1em] &= \frac{1}{4}\ \Big[ \Big(2.3075\Big) + \Big(0.0098\Big) + \Big(1.6667\Big) + \Big(0.7503\Big) + \Big(2.5565\Big) + \Big(1.1228\Big) \Big] \\[1em] &= \frac{1}{4}\ 8.4137 \\[1em] &= 2.1034 \\[1em] \end{align}$

For these data, the mean squared error is MSE = 2.1034. This is the point estimate for the value of σ² in the model.

Show the R Code

© Ole J. Forsberg, Ph.D. 2025. All rights reserved.		.