Lab 15
Confidence Intervals
With z-tests we learned how to decide if a sample came from a
population with a known mean. Now we are going to reverse this process.
We are going to use the sample mean to make an estimate of the
population mean that is unknown.
You may have heard of the term "margin of error" when you've read or
heard about polling data in the news. We are going to learn precisely
how these margins of error are calculated but first we are going to
discuss a related concept called confidence intervals.
If you have a sample mean and you wish to make a guess as to what the
population mean is, you can make two kinds of estimates:
1. Point estimates are guesses that specify an exact number. When you
make a point estimate using the sample mean, it is likely your guess is
near the true population parameter but it is very unlikely that it will
be exactly the same as the parameter. For example, you might make a
point estimate that μ is 4 when really it is 3.55.
2. Interval estimates are guesses that specify a range of numbers. With
interval estimates, you guess that the true population parameter falls
somewhere between 2 numbers. For example, you might make an interval
estimate that μ is between 2 and 6.
A confidence interval is an interval estimate that surrounds the point
estimate. Also, a confidence interval estimates the probability that
the interval is correct. That is, how confident are you that the true
population parameter lies within your interval estimate? For example,
you might say that you are 90% certain that μ falls between 2 and 6.
This means that there is a 10% chance that μ is lower than 2 or greater
than 6. In this case, 2 is the lower bound of the confidence interval
and 6 is the upper bound.
The distance from the point estimate to the upper bound (or lower
bound) is the margin of error.
In this case, the point estimate was 4. Thus the margin of error is
6 - 4 = 2 above the estimate and 2 - 4 = -2 below it, or ± 2.
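The arithmetic above can be written as a short Python sketch. The function name `margins_from_bounds` is made up for illustration; it simply measures how far each bound sits from the point estimate.

```python
def margins_from_bounds(point_estimate, lower_bound, upper_bound):
    """Return the distances from the point estimate down to the lower
    bound and up to the upper bound."""
    distance_below = point_estimate - lower_bound
    distance_above = upper_bound - point_estimate
    return distance_below, distance_above

# The example from the text: point estimate 4, interval from 2 to 6.
print(margins_from_bounds(4, 2, 6))  # (2, 2)
```

Both distances are the same because the interval is centered on the point estimate, which is why a single ± value is enough to describe the margin of error.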
A margin of error can be associated with any degree of confidence but
typically 90%, 95%, or 99% is chosen. When the margin of error's
confidence level is not specified in the news, you can usually assume
that it is a 95% margin of error.
With a 2-tailed z-test, we had 2 critical regions associated with ±
critical z. For example, if α = .05, the critical regions were
associated with critical z = ± 1.96.
To get these two critical z values, we had to divide the α of
.05 in half and split it into the bottom .025 and the top .025 portions
of the normal distribution. Using the NORMSINV function in Excel, we
see that NORMSINV(.025) = -1.96 and NORMSINV(1 - .025) = 1.96.
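If you want to check these critical values outside Excel, here is a minimal sketch. It assumes Python 3.8 or later, where the standard library's `statistics.NormalDist().inv_cdf` plays the same role as NORMSINV. The helper name `critical_z` is made up for illustration.

```python
from statistics import NormalDist

def critical_z(alpha):
    """Return the (negative, positive) critical z values for a 2-tailed alpha."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    negative_z = standard_normal.inv_cdf(alpha / 2)      # like NORMSINV(alpha / 2)
    positive_z = standard_normal.inv_cdf(1 - alpha / 2)  # like NORMSINV(1 - alpha / 2)
    return negative_z, positive_z

negative_z, positive_z = critical_z(0.05)
print(round(negative_z, 2), round(positive_z, 2))  # -1.96 1.96
```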
To calculate lower and upper bounds of a confidence interval, we are
going to use critical z values and the z-test formula to solve for μ
instead of z. With z-tests we used this formula:

z = (X-bar - μ) / standard error

With a little bit of algebraic manipulation, we can solve for μ by
multiplying both sides by the standard error, adding μ to both sides,
and subtracting z times the standard error from both sides. Because the
critical z can be positive or negative, we get this formula:

μ = X-bar ± z * standard error

The point estimate for μ is always the sample mean (X-bar).
The ± z times the standard error will give us our margin of error.
Here is the formula:

margin of error = ± z * σ / sqrt(n)

The lower bound of the confidence interval is the sample mean minus the
margin of error.
The upper bound of the confidence interval is the sample mean plus the
margin of error.
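The pieces above (standard error, critical z, margin of error, and the two bounds) can be put together in a short Python sketch. This is only an illustration: the function name `z_confidence_interval` and the numbers in the usage line are made up, and `NormalDist().inv_cdf` from the standard library stands in for Excel's NORMSINV.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower bound, upper bound) for mu when sigma is known."""
    standard_error = sigma / sqrt(n)
    # Positive critical z, like NORMSINV(1 - (1 - confidence) / 2) in Excel.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Made-up numbers: sample mean 100, sigma 15, n = 9, 95% confidence.
lower, upper = z_confidence_interval(100, 15, 9, 0.95)
print(round(lower, 2), round(upper, 2))  # 90.2 109.8
```

Note that this sketch keeps the critical z at full precision instead of rounding it to 2 decimals, so its answers can differ slightly in the last digit from a hand calculation.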
Here is a figure showing the various concepts so far.

Let's walk through an example.
Suppose we know that we have a sample of n = 25 scores with a mean of
50. The population standard deviation (σ) is 10. What is the 99%
confidence interval for μ?
First, we can calculate the standard error just as we have in the past.
The standard error is the population standard deviation divided by the
square root of the sample size. In this case:
standard error = 10/sqrt(25) = 10/5 = 2
Next, we need to find the 2 critical z scores for a 99% confidence
interval. We need the middle 99% of the distribution, meaning that half
a percent (.005) is below and half a percent is above (1 - .005 =
.995). We can use the NORMSINV function in Excel to find the critical z
scores associated with these probabilities:
=NORMSINV(.005) = -2.58
=NORMSINV(.995) = 2.58
You'll notice that you really only need to look up one of these values
and then multiply it by -1 to get the other critical z.
Plug in the sample mean, the 2 critical z scores, and the standard
error in the formula above and the confidence interval can be
calculated like this:
μ = 50 ± 2.58 * 2
μ = 50 ± 5.16
Lower bound = 50 - 5.16 = 44.84
Upper bound = 50 + 5.16 = 55.16
Thus, the point estimate for μ is 50 with a margin of error of ± 5.16.
The 99% confidence interval for μ is between 44.84 and 55.16. This
means that we are 99% certain that the true mean of the population from
which the sample was drawn is somewhere between 44.84 and 55.16.
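The same walkthrough can be reproduced step by step in Python. This is a sketch, not part of the lab's spreadsheet: the variable names are made up, and the critical z is rounded to 2 decimals on purpose so that the result matches the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sigma, confidence = 25, 50, 10, 0.99

standard_error = sigma / sqrt(n)  # 10 / 5 = 2.0
exact_z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # like NORMSINV(.995)
z = round(exact_z, 2)             # 2.58, rounded as in the walkthrough
margin_of_error = z * standard_error
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error

print(round(lower, 2), round(upper, 2))  # 44.84 55.16
```

If the critical z is left unrounded, the margin of error comes out to about 5.15 instead of 5.16, so small differences in the last decimal are a rounding effect, not a mistake.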
Suppose that a sample of n = 36 scores has a mean of 20. The population
standard deviation (σ) is 9.
Blackboard 1) What is the standard error of this sample mean? Hint:
Standard error = σ / sqrt(n)
Blackboard 2) What is the positive critical z associated with a 90%
confidence interval? Note: Enter the positive critical z only and round
to 2 decimals. Hint: =NORMSINV((1 - .90)/2) in Excel will give the
negative critical z. Enter the absolute value of this critical z. An
alternate method is to calculate the positive critical z score directly
like this: =NORMSINV(1 - (1 - .90) / 2)
Blackboard 3) What is the margin of error? Hint: Multiply your answer
to #1 by your answer to #2.
Blackboard 4) What is the lower bound of the 90% confidence interval
for μ? Hint: Sample mean - margin of error
Blackboard 5) What is the upper bound of the 90% confidence interval
for μ? Hint: Sample mean + margin of error
Suppose that a sample of n = 81 scores has a mean of 30. The population
standard deviation (σ) is 27.
Blackboard 6) What is the lower bound of the 95% confidence interval?
(Round to 2 decimals)
Blackboard 7) If the 99% confidence interval for a population mean is
50 to 60, what was the sample mean? Hint: Sample mean = the average of
the upper and lower bounds of the confidence interval
Confidence Intervals with Proportions
The margin of error is often mentioned in the news in polling data
(e.g., opinion data and candidate preferences). For example, it might
be reported that Candidate A is ahead of Candidate B in the polls 55%
to 45% with 3% margin of error.
When Candidate A is said to have 55% support, this is merely a point
estimate from the sample. There is always some error in the point
estimate and this error can be estimated with the standard error.
Proportions (p) have a special formula for the standard error:

σp = sqrt(p(1 - p)/n)

Other than this, the confidence interval is calculated in the same way:
μ = p ± zσp
Suppose the 55% support came from a sample of 100 people. What is the
95% confidence interval?
μ = .55 ± 1.96*sqrt(.55(1 - .55)/100)
μ = .55 ± .10
We are 95% certain that Candidate A's support falls between .45 and .65.
If the margin of error were reported, I would state, "In a sample of
100 likely voters, Candidate A appears to enjoy 55% support with a
margin of error of ± 10%."
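Here is a minimal Python sketch of the same polling calculation. The function name `proportion_confidence_interval` is made up for illustration, and `NormalDist().inv_cdf` from the standard library is used in place of NORMSINV.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p, n, confidence):
    """Return (lower bound, upper bound, margin of error) for a proportion."""
    standard_error = sqrt(p * (1 - p) / n)
    # Positive critical z, like NORMSINV(1 - (1 - confidence) / 2) in Excel.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * standard_error
    return p - margin_of_error, p + margin_of_error, margin_of_error

# Candidate A: 55% support in a sample of 100, 95% confidence.
lower, upper, margin = proportion_confidence_interval(0.55, 100, 0.95)
print(round(margin, 2), round(lower, 2), round(upper, 2))  # 0.1 0.45 0.65
```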
Candidate B with 45% support turns out to have the exact same margin of
error (look at the formula for the standard error to see why) of ± 10%.
Thus, Candidate B's support is μ = .45 ± .10 or between 35% and 55%.
Notice that the confidence intervals of the candidates overlap (45% to
65% vs. 35% to 55%). This means that we are not confident that
Candidate A is really ahead of Candidate B. This is what is referred to
as a "statistical dead heat." Even though the sample data suggests that
Candidate A is ahead, we cannot make strong conclusions about the race.
This is the same as retaining the null hypothesis. We see a difference in
the sample but it is not large enough to conclude that the difference
would be seen in the population.
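The overlap check for the two candidates can be sketched in Python as follows. Both function names are made up for illustration. The sketch also shows why the two margins of error match: p(1 - p) gives the same value for .55 and .45.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p, n, confidence):
    """Return (lower bound, upper bound) for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * sqrt(p * (1 - p) / n)
    return p - margin_of_error, p + margin_of_error

def intervals_overlap(first, second):
    """True if two (lower, upper) intervals share at least one value."""
    return first[0] <= second[1] and second[0] <= first[1]

candidate_a = proportion_interval(0.55, 100, 0.95)  # about .45 to .65
candidate_b = proportion_interval(0.45, 100, 0.95)  # about .35 to .55
print(intervals_overlap(candidate_a, candidate_b))  # True
```

A result of True corresponds to the "statistical dead heat" described above.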
Blackboard 8) Among 200 chocolate lovers, only 40%
preferred Midnight-in-a-Cave Mini Dark Chocolate Wafers over Cow &
Cacao Milk Chocolate Delights. Do the 99% confidence intervals of the 2
chocolates overlap? Hint: is the upper bound of Midnight-in-a-Cave Mini
Dark Chocolate Wafers higher than the lower bound of Cow & Cacao
Milk Chocolate Delights?
Blackboard 9) 1000 likely voters were surveyed
about how they were likely to vote in the election the next day. 53.5%
said they would vote for the incumbent. Can the incumbent be 90%
certain of victory? (Hint: Is the lower bound of the 90% confidence
interval higher than 50%?)
Blackboard 10) From the question above, what is
the lower bound of the 99% confidence interval? Convert percentages to
proportions (between 0 and 1). Round to 2 decimals.
A spreadsheet tool for both kinds of confidence intervals can be found here.