Understanding the Normal Distribution
The normal distribution is one of the most important continuous probability distributions in statistics. Many natural and human-made measurements are approximately normally distributed, especially when they result from the sum of many small, random influences (for example, heights, test scores, measurement errors).
In this chapter, we focus specifically on what makes a distribution “normal,” how it is described, and how it is used in practice.
Shape and Key Features
A normal distribution has a very distinctive shape:
- It is bell-shaped and symmetric about its center.
- It has a single peak (unimodal).
- Most values cluster around the center, and the probability of extreme values decreases smoothly as you move away from the center.
Two parameters completely describe a normal distribution:
- The mean $ \mu $ (center of the distribution).
- The standard deviation $ \sigma $ (spread of the distribution, with $ \sigma > 0 $).
If a random variable $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, we write
$$
X \sim N(\mu, \sigma^2).
$$
Here $ \sigma^2 $ is the variance.
Symmetry and the Mean/Median/Mode
For a normal distribution:
- The mean, median, and mode all coincide at the same value (the center).
- Because of symmetry, half of the probability lies on each side of the mean, so
$$
P(X \le \mu) = P(X \ge \mu) = 0.5.
$$
The Probability Density Function (PDF)
The normal distribution is continuous, so it is described by a probability density function (PDF), not by a list of discrete probabilities.
For $X \sim N(\mu, \sigma^2)$, the PDF is
$$
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right).
$$
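As a minimal sketch, the density translates directly into Python (standard library only; `normal_pdf` is an illustrative helper name, not a library function):

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, from the formula above."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

print(normal_pdf(0.0, 0.0, 1.0))  # ~0.3989, the peak of the standard normal
print(normal_pdf(2.0, 0.0, 1.0))  # ~0.0540, two standard deviations out
```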
Key points:
- $f(x)$ is strictly positive for every real $x$: the curve never reaches zero, and it extends to both $-\infty$ and $+\infty$.
- The total area under the curve is 1:
$$
\int_{-\infty}^{\infty} f(x)\,dx = 1.
$$
- Probabilities are areas under the curve. For example,
$$
P(a \le X \le b) = \int_a^b f(x)\,dx.
$$
There is no simple formula for these integrals, so in practice we use:
- Normal distribution tables,
- Calculators or computer software (a short sketch follows this list),
- Or the special case of the standard normal distribution (see next section) and the $z$-score.
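As a sketch of the software route, assuming SciPy is installed and using made-up parameters $\mu = 100$, $\sigma = 15$:

```python
from scipy.stats import norm  # SciPy assumed available; a z-table works too

# X ~ N(100, 15^2); P(85 <= X <= 115) is a difference of two CDF values.
mu, sigma = 100.0, 15.0
prob = norm.cdf(115.0, loc=mu, scale=sigma) - norm.cdf(85.0, loc=mu, scale=sigma)
print(prob)  # ~0.6827 (one standard deviation on each side of the mean)
```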
The Standard Normal Distribution
A particularly important special case is the standard normal distribution, which has mean 0 and standard deviation 1:
$$
Z \sim N(0, 1).
$$
Its PDF is
$$
\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{z^2}{2} \right).
$$
The symbol $ \phi(z) $ is often used for this PDF. The cumulative distribution function (CDF) of the standard normal, usually written as $ \Phi(z) $, is
$$
\Phi(z) = P(Z \le z).
$$
Standard normal tables give values of $ \Phi(z) $ for many $z$ values. Software and calculators can compute $ \Phi(z) $ directly.
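If SciPy is not available, the identity $\Phi(z) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}(z/\sqrt{2})\bigr)$ gives $\Phi$ from the Python standard library alone; a minimal sketch:

```python
import math

def Phi(z: float) -> float:
    """Standard normal CDF, via the error function in the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(Phi(0.0))    # 0.5, by symmetry
print(Phi(1.96))   # ~0.9750
print(Phi(-1.96))  # ~0.0250, equal to 1 - Phi(1.96)
```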
The $z$-Score and Standardization
Any normal random variable $X \sim N(\mu, \sigma^2)$ can be converted into a standard normal variable $Z$ by the transformation
$$
Z = \frac{X - \mu}{\sigma}.
$$
This process is called standardization.
The quantity
$$
z = \frac{x - \mu}{\sigma}
$$
for a particular observed value $x$ is called the $z$-score of $x$. It tells you how many standard deviations $x$ is above or below the mean:
- $z > 0$: $x$ is above the mean.
- $z < 0$: $x$ is below the mean.
- $|z|$ large: $x$ is far from the mean (an “unusual” value, depending on context).
Because of standardization, probabilities involving any normal variable $X$ can be turned into probabilities involving $Z$:
- To find $P(X \le x)$:
- Compute $z = (x - \mu)/\sigma$.
- Look up $P(Z \le z) = \Phi(z)$.
- To find $P(a \le X \le b)$:
- Compute $z_a = (a - \mu)/\sigma$, $z_b = (b - \mu)/\sigma$.
- Then
$$
P(a \le X \le b) = P(z_a \le Z \le z_b) = \Phi(z_b) - \Phi(z_a).
$$
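Putting this recipe into code, with hypothetical test scores $X \sim N(70, 10^2)$ and the erf-based `Phi` helper from the previous sketch:

```python
import math

def Phi(z: float) -> float:
    # Standard normal CDF (same erf-based helper as above)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: X ~ N(70, 10^2); find P(60 <= X <= 85).
mu, sigma = 70.0, 10.0
z_a = (60.0 - mu) / sigma  # -1.0
z_b = (85.0 - mu) / sigma  #  1.5
print(Phi(z_b) - Phi(z_a))  # ~0.7745
```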
The Empirical Rule (68–95–99.7 Rule)
For any normal distribution $N(\mu, \sigma^2)$, predictable proportions of values lie within fixed distances of the mean:
- About 68% of values lie within 1 standard deviation of the mean:
$$
P(\mu - \sigma \le X \le \mu + \sigma) \approx 0.68.
$$
- About 95% of values lie within 2 standard deviations:
$$
P(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 0.95.
$$
- About 99.7% of values lie within 3 standard deviations:
$$
P(\mu - 3\sigma \le X \le \mu + 3\sigma) \approx 0.997.
$$
This pattern is sometimes called the 68–95–99.7 rule or the empirical rule. It gives a quick way to judge how unusual a data point is:
- Values more than about 2 standard deviations from the mean are relatively rare.
- Values more than about 3 standard deviations from the mean are very rare.
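The three figures above can be checked directly: after standardization, the probability within $k$ standard deviations is $\Phi(k) - \Phi(-k) = 2\Phi(k) - 1$, independent of $\mu$ and $\sigma$. A quick sketch:

```python
import math

def Phi(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(mu - k*sigma <= X <= mu + k*sigma) = 2*Phi(k) - 1 for any normal X.
for k in (1, 2, 3):
    print(k, round(2.0 * Phi(float(k)) - 1.0, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```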
Using the Normal Distribution in Practice
Here are typical ways the normal distribution is used:
Approximating Real-World Measurements
Many measurements or test scores are modeled as normal distributions. Once you assume $X \sim N(\mu, \sigma^2)$, you can:
- Estimate the probability that a random individual’s value lies in a certain range.
- Determine thresholds, such as “top 10%” or “bottom 5%”, using $z$-scores and normal tables.
Finding Percentiles and Cutoffs
The $p$th percentile of a normal distribution (with $p$ written as a proportion, e.g. $p = 0.95$ for the 95th percentile) is the value $x_p$ such that
$$
P(X \le x_p) = p.
$$
Using standardization:
- Find $z_p$ so that $\Phi(z_p) = p$ (from tables or software).
- Convert back: $x_p = \mu + z_p \sigma$.
This is how you find, for example, the score that marks the top 5% of a normally distributed test.
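A sketch of that top-5% computation, assuming SciPy and made-up parameters $\mu = 500$, $\sigma = 100$ (the 95th percentile is the cutoff for the top 5%):

```python
from scipy.stats import norm  # SciPy assumed; a z-table gives the same z_p

mu, sigma = 500.0, 100.0
z_95 = norm.ppf(0.95)     # inverse CDF: ~1.6449, since Phi(1.6449) = 0.95
x_95 = mu + z_95 * sigma  # convert back to the original scale
print(x_95)               # ~664.5: scores above this mark the top 5%
```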
Converting Between Raw Scores and $z$-Scores
- Given an observation $x$, its $z$-score is
$$
z = \frac{x - \mu}{\sigma}.
$$
- Given a $z$-score, the corresponding value on the original scale is
$$
x = \mu + z\sigma.
$$
This allows you to compare results from different normal distributions on a common scale (the $z$-scale).
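For instance, with two made-up exam scales, the $z$-scale makes results comparable:

```python
# Hypothetical scales: exam A ~ N(70, 8^2), exam B ~ N(500, 90^2).
z_a = (82.0 - 70.0) / 8.0     # 1.50: exam A result, in standard deviations
z_b = (620.0 - 500.0) / 90.0  # ~1.33: exam B result, in standard deviations
print(z_a > z_b)              # True: the exam A score is relatively stronger

# Converting z = 1.5 back to exam B's original scale: x = mu + z * sigma.
print(500.0 + 1.5 * 90.0)     # 635.0
```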
Normal Approximation to Other Distributions (Idea Only)
The normal distribution is often used to approximate other distributions under certain conditions, especially when sample sizes are large.
One important case is when a sum (or average) of many small, independent random effects is involved. Under suitable conditions, such sums are approximately normal. This idea is formalized in the Central Limit Theorem, which is treated elsewhere, but it helps explain why the normal distribution appears so frequently.
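As an illustration only (not a substitute for the theorem's statement), a short simulation shows sums of independent Uniform(0, 1) effects landing close to the normal predictions for their mean and spread:

```python
import random
import statistics

random.seed(0)  # reproducible illustration
n_effects, n_samples = 50, 10_000
sums = [sum(random.random() for _ in range(n_effects)) for _ in range(n_samples)]

# Theory for a sum of 50 Uniform(0, 1) variables:
# mean = 50 * 0.5 = 25, sd = sqrt(50 / 12) ~ 2.04.
print(statistics.mean(sums))   # close to 25
print(statistics.stdev(sums))  # close to 2.04
```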
Limitations and Cautions
While the normal distribution is very useful, it is not always appropriate:
- Data that are highly skewed or have many extreme values (outliers) may not be well modeled by a normal distribution.
- Some variables cannot be negative, yet a normal model assigns positive probability to every real number, including negative values.
- Modeling should always be checked against actual data (for example, using histograms or other diagnostic plots) before assuming normality.
Despite these limitations, the normal distribution remains a central tool in probability and statistics, especially when working with continuous data, $z$-scores, and approximate probabilities.