The Problem
Example #261: Estimate the success rate for the population using a confidence interval. To estimate this rate, we collect data. The data are a series of “Success” and “Failure” values. For our sample, the data are
“Failure”, “Success”, “Failure”, “Success”, “Success”, “Success”, “Failure”, “Success”, “Success”, “Failure”, “Failure”, “Failure”, “Failure”, “Failure”, “Failure”, “Success”, “Success”, “Failure”, “Failure”, “Success”, “Failure”, “Success”, “Success”, “Failure”, “Failure”, “Failure”, “Failure”, “Failure”, “Success”, “Success”, “Failure”, “Success”, “Failure”, “Success”, “Success”, “Success”, “Failure”, “Failure”, “Failure”, “Failure”
With this information, calculate the endpoints of the symmetric 98% confidence interval.
Information given:
To summarize the above, the values of import are:
Summary statistics from the problem
\( x \) | = | 17 |
\( n \) | = | 40 |
\( \hat{p} \) | = | 0.425 |
\( \alpha \) | = | 0.02 |
Note that there is no value given for p0. This is because confidence intervals are based solely on the data, and not on any hypothesized values.
It may be helpful to calculate these values yourself from the data, then check them against the summary statistics above.
Your Answer
The correct endpoints of the 98% confidence interval are (0.2429, 0.6071).
Assistance
Solution
$$ \begin{align}
\text{Confidence Limits} &= \hat{p} \pm Z(\alpha/2) \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\[3em]
&= 0.425 \pm Z(0.02/2) \sqrt{\frac{ 0.425 \left(1- 0.425\right)}{ 40}} \\[1em]
&= 0.425 \pm \left(2.33\right) \sqrt{\frac{ 0.2444}{ 40}} \\[1em]
&= 0.425 \pm \left(2.33\right) \sqrt{ 0.0061} \\[1em]
&= 0.425 \pm \left(2.33\right) 0.078162 \\[1em]
&= 0.425 \pm 0.182119 \end{align}
$$
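The last line of the derivation gives the point estimate and the margin of error. As a small worked step, adding and subtracting the margin produces the two endpoints (rounded to four decimal places at the end):

$$ \begin{align}
\text{Lower limit} &= 0.425 - 0.182119 = 0.242881 \approx 0.2429 \\[1em]
\text{Upper limit} &= 0.425 + 0.182119 = 0.607119 \approx 0.6071 \end{align}
$$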
Thus, we are 98% confident that the success rate in the population is between 0.2429 and 0.6071.
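The value 2.33 used for Z(0.02/2) in the solution is a rounded table value. The following sketch shows one way to obtain the critical value in R and how much the rounding changes the margin of error. The variable names `z_exact` and `z_rounded` are illustrative only and do not appear elsewhere on this page.

```r
alpha <- 0.02                      # 1 - confidence level

# Upper-tail critical value: the point with area alpha/2 to its right
z_exact   <- qnorm(1 - alpha / 2)  # roughly 2.3263
z_rounded <- round(z_exact, 2)     # 2.33, the table value used in the solution

# Standard error of p-hat for x = 17 successes in n = 40 trials
phat <- 17 / 40
se   <- sqrt(phat * (1 - phat) / 40)

# Margin of error with each version of the critical value
c(rounded = z_rounded * se, exact = z_exact * se)
```

The two margins differ only in the fourth decimal place, which is why a hand calculation and a software calculation can disagree slightly in the last digits.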
Note that 0.182119 is the margin of error, which is usually symbolized as E. So, for a sample like this of size n = 40, polling companies will report the results as “42.5% plus or minus 18.2 points.” As you may expect, larger sample sizes will produce smaller margins of error.
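To illustrate how the margin of error shrinks as the sample grows, here is a minimal sketch that holds the sample proportion fixed at 0.425 and varies n. Only n = 40 comes from this example; the other sample sizes are hypothetical values chosen for illustration.

```r
phat  <- 0.425                 # sample proportion from this example
alpha <- 0.02                  # for a 98% confidence interval
z     <- qnorm(1 - alpha / 2)  # unrounded critical value

# Hypothetical sample sizes; only n = 40 comes from the example
n <- c(40, 160, 640, 2560)

# Margin of error E = z * sqrt(phat * (1 - phat) / n)
E <- z * sqrt(phat * (1 - phat) / n)

data.frame(n = n, E = round(E, 4))
```

Because n sits under a square root, each fourfold increase in the sample size cuts the margin of error in half.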
The R Code
This formulation of the confidence interval is pedagogically simple, which is why introductory textbooks use it. It is based on the Normal approximation to the Binomial distribution, and several improved intervals exist. For these reasons, R does not have a built-in function to calculate this version of the confidence interval. The following code echoes the above calculations to provide the endpoints of the confidence interval.
Copy and paste the following code into your R script window, then run it from there.
sample = c("Failure", "Success", "Failure", "Success", "Success", "Success", "Failure", "Success", "Success", "Failure", "Failure", "Failure", "Failure", "Failure", "Failure", "Success", "Success", "Failure", "Failure", "Success", "Failure", "Success", "Success", "Failure", "Failure", "Failure", "Failure", "Failure", "Success", "Success", "Failure", "Success", "Failure", "Success", "Success", "Success", "Failure", "Failure", "Failure", "Failure")
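x = sum(sample=="Success")    # number of successes
n = length(sample)            # sample size
phat = x/n                    # sample proportion
alpha = 0.02                  # 1 - confidence level
se2 = phat*(1-phat)/n         # squared standard error of phat
# qnorm(alpha/2) is negative, so adding it gives the lower limit
lcl = phat + qnorm(alpha/2)*sqrt(se2)
# and subtracting it gives the upper limit
ucl = phat - qnorm(alpha/2)*sqrt(se2)
lcl; ucl                      # print both endpoints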
In the R output, the endpoints of the confidence interval are the two numbers printed by the last line. Because R uses the unrounded critical value (about 2.3263) rather than the table value 2.33, its endpoints, approximately 0.2432 and 0.6068, differ slightly from the hand calculation and are more accurate.
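For comparison, base R does provide functions for some of the improved intervals. The sketch below assumes the Wilson score interval and the Clopper-Pearson interval are the alternatives of interest; the page itself does not say which improvements it has in mind. Both `prop.test` and `binom.test` are in the `stats` package that loads with R.

```r
x <- 17   # number of successes
n <- 40   # sample size

# Wilson score interval (continuity correction switched off)
wilson <- prop.test(x, n, conf.level = 0.98, correct = FALSE)$conf.int

# Clopper-Pearson interval, computed from the Binomial distribution itself
exact <- binom.test(x, n, conf.level = 0.98)$conf.int

rbind(wilson = as.numeric(wilson), exact = as.numeric(exact))
```

Neither interval is centered on the sample proportion 0.425, so neither can be written as "estimate plus or minus a margin of error" the way the textbook interval can.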
The Excel Code
This formulation of the confidence interval is pedagogically simple, which is why introductory textbooks use it. It is based on the Normal approximation to the Binomial distribution, and several improved intervals exist. For such reasons, Excel does not have a built-in function to perform these calculations. The following code echoes the above calculations to provide the confidence interval.
Copy and paste the following code into your Excel window, making sure the value sample ends up in A1 after pasting.
How to calculate the confidence interval in Excel.
sample | | | |
Failure | | alpha: | 0.02 |
Success | | x: | =COUNTIF(A:A,"Success") |
Failure | | n: | =COUNTIF(A:A,"Success")+COUNTIF(A:A,"Failure") |
Success | | p-hat: | =D3/D4 |
Success | | lcl: | =D5+NORM.S.INV(D2/2)*SQRT(D5*(1-D5)/D4) |
Success | | ucl: | =D5-NORM.S.INV(D2/2)*SQRT(D5*(1-D5)/D4) |
Failure | | | |
Success | | | |
Success | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Success | | | |
Success | | | |
Failure | | | |
Failure | | | |
Success | | | |
Failure | | | |
Success | | | |
Success | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Success | | | |
Success | | | |
Failure | | | |
Success | | | |
Failure | | | |
Success | | | |
Success | | | |
Success | | | |
Failure | | | |
Failure | | | |
Failure | | | |
Failure | | | |
The endpoints of the 98% confidence interval are the numbers calculated in cells D6 (lcl) and D7 (ucl); cell D5 holds the sample proportion. Again, when you paste this code into Excel, make sure that you start the pasting in cell A1. To help with that, you may want to also copy this notice. It seems to help.