A confidence interval is a way to use sample data to give a range of plausible values for an unknown population parameter, such as a mean or a proportion. In inferential statistics, this is one of the two main tasks (the other is hypothesis testing): instead of just giving a single estimate, we report an interval together with a confidence level.
In this chapter, we focus on what is specific to confidence intervals: what they mean, how they are built in simple common cases, and how to interpret them correctly.
The idea of a confidence interval
Suppose you want to estimate the average height of all students in a large school (the population mean), but you only measure the heights of a small group (a sample). From the sample, you can compute:
- a point estimate (for example, the sample mean $\bar{x}$), and
- a confidence interval (an interval around $\bar{x}$ meant to capture the true population mean $\mu$ with some chosen level of confidence, like $95\%$).
A typical confidence interval has the form
$$
\text{estimate} \;\pm\; \text{margin of error}.
$$
For a mean, you might see:
$$
\bar{x} \;\pm\; \text{(critical value)} \times \text{(standard error)}.
$$
The standard error measures the typical size of sampling fluctuations in your estimate. The critical value depends on the confidence level (for example, $95\%$) and on the sampling distribution (like the normal or $t$-distribution).
The result is a range:
$$
[\text{lower bound},\ \text{upper bound}],
$$
which we report along with the confidence level.
Confidence level and its meaning
A confidence level (such as $90\%$, $95\%$, or $99\%$) describes the long-run performance of the method, not a probability that the specific interval you computed is correct.
Imagine you repeatedly:
- Draw a random sample from the same population.
- Compute a confidence interval using the same procedure and same confidence level.
Then:
- About $95\%$ of all $95\%$ confidence intervals would contain the true parameter.
- About $90\%$ of all $90\%$ confidence intervals would contain the true parameter, and so on.
Once you have one particular interval from your data, the true parameter is either inside it or not; there is no randomness left in the parameter. The confidence level refers to the reliability of the procedure, not to the probability of this one specific interval.
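To see this long-run behavior concretely, here is a minimal simulation sketch in Python (assuming NumPy is available; the population mean, standard deviation, sample size, and number of repetitions are illustrative choices, not values from the text). It repeatedly draws samples from a known population, builds a $95\%$ interval each time, and counts how often the interval captures the true mean.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 10.0, 2.0          # "true" population parameters (illustrative)
n, reps, z_star = 25, 10_000, 1.96

hits = 0
for _ in range(reps):
    sample = rng.normal(mu, sigma, size=n)    # draw a fresh random sample
    x_bar = sample.mean()
    half_width = z_star * sigma / np.sqrt(n)  # known-sigma margin of error
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1                             # this interval captured mu

print(f"Coverage over {reps} intervals: {hits / reps:.3f}")  # close to 0.95
```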
A common, but technically incorrect, wording is:
- “There is a $95\%$ chance that the true mean lies in this interval.”
Better wordings are:
- “We are $95\%$ confident that this interval contains the true mean.”
- “This interval was constructed using a method that captures the true mean $95\%$ of the time in repeated sampling.”
Structure of a confidence interval
Most basic confidence intervals follow this pattern:
$$
\text{estimate} \;\pm\; (\text{critical value}) \times (\text{standard error}).
$$
- Estimate: sample statistic used to estimate the population parameter (for example, sample mean $\bar{x}$ or sample proportion $\hat{p}$).
- Standard error: an estimate of how much the statistic varies from sample to sample.
- Critical value: a multiplier chosen so that the resulting interval has the desired confidence level, based on the sampling distribution (e.g., a $z$-score or a $t$-score).
The more variable your estimate is (larger standard error), the wider your interval. A higher confidence level also leads to a larger critical value, and therefore a wider interval.
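As a rough sketch of this pattern in code (the function name and the numbers in the example are purely illustrative):

```python
def confidence_interval(estimate, critical_value, standard_error):
    """Return (lower, upper) for estimate +/- critical_value * standard_error."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin

# Example: a sample mean of 50.2 with standard error 0.75 at the 95% level
print(confidence_interval(50.2, 1.96, 0.75))  # roughly (48.73, 51.67)
```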
Confidence interval for a population mean (large-sample / known $\sigma$ case)
We start with a basic, idealized case that illustrates the structure clearly. Assume:
- You are estimating a population mean $\mu$.
- The population standard deviation $\sigma$ is known.
- You have a random sample of size $n$.
- $n$ is reasonably large, or the population is normally distributed, so that the sampling distribution of $\bar{x}$ is approximately normal.
Then the standard error of the sample mean is
$$
\text{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}.
$$
For a confidence level such as $95\%$, there is a corresponding $z$-critical value $z^\*$ chosen so that a standard normal random variable falls between $-z^\*$ and $+z^\*$ about $95\%$ of the time.
Common $z^\*$ values:
- $90\%$ confidence: $z^\* \approx 1.645$
- $95\%$ confidence: $z^\* \approx 1.96$
- $99\%$ confidence: $z^\* \approx 2.576$
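If a table is not at hand, these critical values can be computed directly; a small sketch, assuming SciPy is available:

```python
from scipy.stats import norm

for level in (0.90, 0.95, 0.99):
    # z* cuts off (1 - level)/2 in each tail of the standard normal
    z_star = norm.ppf(1 - (1 - level) / 2)
    print(f"{level:.0%} confidence: z* = {z_star:.3f}")
# 90% -> 1.645, 95% -> 1.960, 99% -> 2.576
```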
The confidence interval for $\mu$ is then
$$
\bar{x} \;\pm\; z^\* \cdot \frac{\sigma}{\sqrt{n}}.
$$
The two endpoints are:
- Lower bound: $\bar{x} - z^\* \dfrac{\sigma}{\sqrt{n}}$
- Upper bound: $\bar{x} + z^\* \dfrac{\sigma}{\sqrt{n}}$
This formula shows:
- Larger $n$ makes $\dfrac{\sigma}{\sqrt{n}}$ smaller, so the interval gets narrower.
- Larger $z^\*$ (higher confidence level) makes the interval wider.
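A small numeric sketch of this interval (the sample mean, $\sigma$, and $n$ below are made-up values for illustration):

```python
import math

x_bar, sigma, n = 172.5, 8.0, 64   # illustrative: mean height in cm, known sigma
z_star = 1.96                      # 95% confidence

se = sigma / math.sqrt(n)          # standard error of the mean = 1.0
margin = z_star * se
lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI for mu: [{lower:.2f}, {upper:.2f}]")  # [170.54, 174.46]
```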
In practice, $\sigma$ is usually unknown; then we typically replace it with the sample standard deviation and use a $t$-distribution. Detailed work with the $t$-distribution belongs to another chapter on distributions; here we focus on the idea that the critical value changes depending on the distribution used.
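For the unknown-$\sigma$ case, here is a minimal sketch of the $t$-based version, assuming NumPy and SciPy are available and using made-up sample data:

```python
import numpy as np
from scipy.stats import t

sample = np.array([9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5])  # illustrative data
n = len(sample)
x_bar = sample.mean()
s = sample.std(ddof=1)           # sample standard deviation (n - 1 denominator)

t_star = t.ppf(0.975, df=n - 1)  # 95% critical value with n - 1 degrees of freedom
margin = t_star * s / np.sqrt(n)
print(f"95% t-interval: [{x_bar - margin:.2f}, {x_bar + margin:.2f}]")  # roughly [9.82, 10.53]
```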
Confidence interval for a population proportion
Now consider estimating a population proportion $p$ (for example, the proportion of people in a city who support a certain policy). Suppose:
- We draw a random sample of size $n$.
- Each individual is classified as “success” or “failure.”
- We observe $X$ successes in the sample.
- The sample proportion is
$$
\hat{p} = \frac{X}{n}.
$$
For a sufficiently large sample size (so that the sampling distribution of $\hat{p}$ is approximately normal), the standard error of $\hat{p}$ is
$$
\text{SE}(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.
$$
For a confidence level such as $95\%$, we again use a $z$-critical value $z^\*$, as in the mean case. The confidence interval for $p$ is
$$
\hat{p} \;\pm\; z^\* \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.
$$
As before:
- The interval narrows as $n$ increases, because the standard error decreases.
- Higher confidence levels (larger $z^\*$) make the interval wider.
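A short sketch of this calculation (the counts below are illustrative, not real survey data):

```python
import math

successes, n = 252, 400          # illustrative survey counts
z_star = 1.96                    # 95% confidence

p_hat = successes / n            # sample proportion, here 0.63
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * se
print(f"p_hat = {p_hat:.3f}, 95% CI: [{p_hat - margin:.3f}, {p_hat + margin:.3f}]")
# roughly [0.583, 0.677]
```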
This is one of the most common types of confidence intervals in introductory statistics, especially in survey results.
How sample size and confidence level affect width
Three key factors influence the width of a confidence interval:
- Sample size $n$
The standard error usually contains a factor $\dfrac{1}{\sqrt{n}}$, so:
- Larger $n$ → smaller standard error → narrower interval.
- Smaller $n$ → larger standard error → wider interval.
- Confidence level
The critical value (such as $z^\*$) increases with the confidence level:
- Higher confidence (e.g., $99\%$ instead of $95\%$) → larger critical value → wider interval.
- Lower confidence (e.g., $90\%$ instead of $95\%$) → smaller critical value → narrower interval.
There is a trade-off: more confidence means less precision (wider interval), and more precision (narrower interval) means less confidence.
- Variability in the data
The population standard deviation $\sigma$, or the estimated variability (like $\hat{p}(1-\hat{p})$ for proportions), affects the standard error:
- More variability → larger standard error → wider interval.
- Less variability → smaller standard error → narrower interval.
These relationships are crucial when planning studies: to achieve a certain margin of error at a chosen confidence level, you can determine what sample size you need.
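As a sketch of that planning step, the margin-of-error expressions above can be solved for $n$; the helper names below are just for illustration, the known-$\sigma$ formula is used for the mean, and $p = 0.5$ is the usual conservative default for a proportion:

```python
import math

def n_for_mean(sigma, margin, z_star=1.96):
    """Smallest n with z* * sigma / sqrt(n) <= margin (known-sigma case)."""
    return math.ceil((z_star * sigma / margin) ** 2)

def n_for_proportion(margin, z_star=1.96, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin; p = 0.5 is the conservative default."""
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

print(n_for_mean(sigma=8.0, margin=1.0))  # 246
print(n_for_proportion(margin=0.03))      # 1068
```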
Assumptions behind confidence intervals
Confidence intervals are not magic; they rely on assumptions about how the data were collected and what distributions are appropriate. Typical assumptions for simple intervals are:
- The sample is drawn randomly from the population (or is at least representative).
- Individual observations are independent of each other.
- For mean intervals using normal or $t$-based methods:
- Either the population is roughly normal, or
- The sample size $n$ is large enough that the sampling distribution of the mean is approximately normal.
- For proportion intervals using $z$-based methods:
- The sample size is large enough that the sampling distribution of $\hat{p}$ is approximately normal (often checked by requiring that the expected numbers of successes and failures, $n\hat{p}$ and $n(1-\hat{p})$, are not too small).
If these assumptions are badly violated, the advertised confidence level (e.g., $95\%$) might no longer be accurate.
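As an illustration of the proportion check, a common rule of thumb requires at least about $10$ expected successes and $10$ expected failures (some texts use $5$); a tiny sketch with a hypothetical helper:

```python
def large_sample_ok(successes, n, threshold=10):
    """Check that both observed successes and failures are 'not too small'.
    The threshold of 10 is a common rule of thumb; some texts use 5."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

print(large_sample_ok(252, 400))  # True: 252 successes, 148 failures
print(large_sample_ok(3, 40))     # False: only 3 successes
```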
Correct and incorrect interpretations
Interpreting confidence intervals accurately is essential. Some common points:
- A $95\%$ confidence interval does not mean:
- “There is a $95\%$ probability that the true parameter is in our particular interval.”
- It does mean:
- “We used a method that, in repeated random samples, would produce intervals that capture the true parameter $95\%$ of the time. Based on this method, we are $95\%$ confident that this interval contains the true parameter.”
Some further interpretive points:
- If two $95\%$ confidence intervals for two groups’ means (or proportions) barely overlap or do not overlap, this often suggests a meaningful difference between the groups. However, a formal test of the difference belongs to hypothesis testing, not to the confidence interval itself.
- A very wide interval often indicates:
- A small sample size, or
- High variability in the data.
- A very narrow interval indicates:
- A large sample size and/or
- Low variability, giving a more precise estimate.
Confidence intervals should be reported alongside the point estimate, because they convey the uncertainty as well as the central value.
Using confidence intervals in practice
In real applications, confidence intervals are used to express uncertainty in many settings:
- Opinion polls: reporting the proportion of respondents who support a policy, with a margin of error.
- Medical studies: estimating average treatment effects, such as the difference in mean blood pressure between two treatments.
- Quality control: estimating the average weight, length, or some other measurement of manufactured items.
When reporting results, it is common to write something like:
- “The estimated mean is $50.2$ units, with a $95\%$ confidence interval from $48.7$ to $51.7$.”
- “$63\%$ of respondents supported the proposal ($95\%$ confidence interval: $59\%$ to $67\%$).”
Such statements indicate both the estimate and its precision, helping others evaluate how reliable the estimate is.
Summary
In this chapter, we have focused on what is distinctive to confidence intervals in inferential statistics:
- A confidence interval gives a range of plausible values for a population parameter, based on a sample.
- It has the general form
$$
\text{estimate} \pm \text{(critical value)} \times \text{(standard error)}.
$$
- The confidence level refers to the long-run performance of the method, not the probability that the particular interval you computed is correct.
- For means and proportions, common basic forms use normal or related distributions to obtain critical values.
- The width of a confidence interval depends on sample size, confidence level, and data variability.
- Correct interpretation emphasizes uncertainty and reliability, not absolute guarantees.
More advanced chapters can extend these ideas to other parameters, more complicated data structures, and alternative interval-construction methods.