Standard deviation is a number that tells you how spread out a set of data values is around its mean (average). In this chapter, we focus on understanding what standard deviation measures, how to compute it in simple cases, and how to interpret it in context.
Because this chapter is part of Descriptive Statistics, we will assume you already know what a mean is and have a basic idea of what “spread” or “variability” means from the chapter on variance. Here, we connect that idea to a more practical and commonly used measure: standard deviation.
Why standard deviation is useful
When you have a data set, two questions often matter:
- Where are the data values centered? (This is measured by the mean or other “center” measures.)
- How spread out are the data values? (This is measured by things like range, variance, and standard deviation.)
Range (largest minus smallest) uses only two data points. Variance uses all the data points but is expressed in “squared units,” which can feel unnatural. Standard deviation fixes this by being:
- Based on all data values (like variance),
- In the same units as the original data (unlike variance, which uses squared units).
For example, if you measure test scores in points, the standard deviation is also in points. If you measure height in centimeters, the standard deviation is in centimeters.
Relationship between variance and standard deviation
Variance measures the average squared distance of the data values from the mean. Standard deviation is simply the square root of the variance.
If the variance is $s^2$, the standard deviation is
$$
s = \sqrt{s^2}.
$$
So:
- Large variance → large standard deviation → data are widely spread.
- Small variance → small standard deviation → data are tightly clustered around the mean.
In practice, people talk about standard deviation more often than variance because its units are easier to interpret.
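As a quick numeric illustration (with made-up numbers): if a set of test scores has a variance of $s^2 = 16$ square points, then its standard deviation is
$$
s = \sqrt{16} = 4 \text{ points},
$$
which is back in the original units of the data.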
Population vs. sample standard deviation
There are two common situations:
- You have data for every member of a group you care about. This is a population.
- You only have data from some members of a larger group. This is a sample.
The formulas differ slightly in the denominator, just as they do for variance.
Population standard deviation
Suppose your population has $N$ values:
$$
x_1, x_2, \dots, x_N
$$
and the population mean is
$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i.
$$
The population variance is
$$
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2,
$$
and the population standard deviation is
$$
\sigma = \sqrt{\sigma^2}
= \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}.
$$
Here the symbol $\sigma$ (Greek letter “sigma”) is used for the population standard deviation.
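As a small worked example with made-up values, take the population $2, 4, 4, 4, 5, 5, 7, 9$, so $N = 8$ and $\mu = 40/8 = 5$. The squared deviations $(x_i - \mu)^2$ are $9, 1, 1, 1, 0, 0, 4, 16$, which sum to $32$, giving
$$
\sigma^2 = \frac{32}{8} = 4
\qquad\text{and}\qquad
\sigma = \sqrt{4} = 2.
$$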
Sample standard deviation
Now suppose you only have a sample of $n$ values from a larger population:
$$
x_1, x_2, \dots, x_n,
$$
with sample mean
$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.
$$
The sample variance is
$$
s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2,
$$
and the sample standard deviation is
$$
s = \sqrt{s^2}
= \sqrt{\frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2}.
$$
Notice the denominator is $n - 1$ rather than $n$. This adjustment is used when the sample is standing in for a larger population: dividing by $n - 1$ compensates for the tendency of a sample to slightly underestimate the population's spread. The practical rule: when working with a sample, use $n - 1$ in the denominator.
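If you compute standard deviations in software, the same distinction appears there. Below is a minimal Python sketch (the data values are made up) using the standard library's `statistics` module, where `stdev` divides by $n - 1$ and `pstdev` divides by $N$:

```python
import statistics

data = [4, 8, 6, 5, 3, 7]  # made-up sample of six values

print("mean:", statistics.mean(data))
print("sample standard deviation (divides by n - 1):", statistics.stdev(data))
print("population standard deviation (divides by N):", statistics.pstdev(data))
```

Other libraries make different default choices: NumPy's `numpy.std`, for example, divides by $N$ by default and uses $n - 1$ only when you pass `ddof=1`, so it is worth checking which denominator your tool uses.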
Step-by-step computation (sample standard deviation)
To make the process concrete, here is the step-by-step procedure for a sample:
- List your data values.
- Compute the sample mean $\bar{x}$.
- Subtract the mean from each value to get the deviations $x_i - \bar{x}$.
- Square each deviation: $(x_i - \bar{x})^2$.
- Sum the squared deviations: $\sum_{i=1}^{n}(x_i - \bar{x})^2$.
- Divide by $n - 1$ to get the sample variance $s^2$.
- Take the square root of the variance to get $s$, the standard deviation.
These same steps apply to the population standard deviation; the only changes are that you use the population mean $\mu$ and divide by $N$ instead of $n - 1$.
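To mirror these steps in code, here is a minimal from-scratch Python sketch; the function name `sample_std_dev` and the example values are our own choices for illustration, not part of any library:

```python
from math import sqrt

def sample_std_dev(values):
    """Sample standard deviation, computed step by step (divides by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")

    x_bar = sum(values) / n                        # sample mean
    deviations = [x - x_bar for x in values]       # x_i - x_bar
    squared = [d ** 2 for d in deviations]         # (x_i - x_bar)^2
    variance = sum(squared) / (n - 1)              # sample variance s^2
    return sqrt(variance)                          # standard deviation s

print(sample_std_dev([4, 8, 6, 5, 3, 7]))  # same made-up data as above
```

Dividing by `n` instead of `n - 1` (and using data for every member of the group) would give the population version.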
Interpreting the size of the standard deviation
The standard deviation tells you how tightly or loosely the data are grouped around the mean.
- Small standard deviation:
- Data values are close to the mean.
- The distribution is “narrow” and “tall” if you draw a histogram.
- Example: Heights of students in a single classroom might have a small standard deviation if everyone is about the same height.
- Large standard deviation:
- Data values are spread farther from the mean.
- The distribution is “wide” and “flat” if you draw a histogram.
- Example: Ages of people attending a public event might have a large standard deviation if there are children, adults, and elderly people.
The actual number has to be interpreted in the context of the units and the mean. For example:
- A standard deviation of $2$ points on a test scored out of $10$ is fairly large.
- A standard deviation of $2$ points on a test scored out of $100$ is relatively small.
Standard deviation and distance from the mean
Standard deviation gives a rough sense of how far a “typical” data value lies from the mean, but it is not exactly the average distance. It’s the square root of the average squared distance.
However, for many kinds of data (especially when the distribution is roughly bell-shaped), a useful rule of thumb is:
- Many values lie within 1 standard deviation of the mean.
- A large majority lie within 2 standard deviations of the mean.
- Very few lie more than 3 standard deviations from the mean.
The exact percentages depend on the shape of the distribution and are explored more formally in other chapters (particularly when you study the normal distribution).
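As an informal check of this rule of thumb, the sketch below generates a large, roughly bell-shaped sample with Python's `random.gauss` (the mean, standard deviation, and sample size are arbitrary choices) and counts how many values fall within 1, 2, and 3 standard deviations of the mean:

```python
import random

random.seed(0)
mu, sigma = 50, 10                                        # arbitrary parameters
data = [random.gauss(mu, sigma) for _ in range(100_000)]  # bell-shaped sample

for k in (1, 2, 3):
    within = sum(1 for x in data if abs(x - mu) <= k * sigma)
    print(f"within {k} standard deviation(s) of the mean: {within / len(data):.1%}")
```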
Standard deviation in grouped data (optional idea)
Often data are summarized in frequency tables, where exact raw values are not all listed, but counts (frequencies) of each value or each class are given. In such cases:
- You estimate the mean by multiplying each value (or class midpoint) by its frequency, adding these products, and dividing by the total frequency.
- You estimate the variance and standard deviation using the same formulas, but each term $(x_i - \bar{x})^2$ is multiplied by its frequency.
The underlying idea does not change: standard deviation is still the square root of the variance, and it still measures spread around the mean.
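Here is a minimal Python sketch of that idea, treating a small made-up frequency table as a complete population (so the denominator is the total count; for a sample you would divide by that count minus one instead):

```python
from math import sqrt

# made-up frequency table: value (or class midpoint) -> frequency
table = {10: 3, 20: 5, 30: 8, 40: 4}

total = sum(table.values())                              # total number of observations
mean = sum(x * f for x, f in table.items()) / total      # frequency-weighted mean
variance = sum(f * (x - mean) ** 2 for x, f in table.items()) / total
std_dev = sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```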
Comparing variability with standard deviations
Standard deviation is especially helpful for comparing how variable two data sets are, even if they have similar means.
Suppose Data Set A and Data Set B both have a mean of $50$:
- If Data Set A has a standard deviation of $3$ and Data Set B has a standard deviation of $12$, then:
- A is fairly consistent (values mostly close to $50$),
- B is much more variable (values more widely scattered around $50$).
This kind of comparison is common in many fields:
- Education (comparing consistency of test scores),
- Manufacturing (measuring consistency of product dimensions),
- Finance (measuring how volatile investment returns are),
- And many others.
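To make this kind of comparison concrete, here is a short Python sketch with two made-up data sets whose means are both close to $50$ but whose spreads differ:

```python
import statistics

data_sets = {
    "A": [48, 50, 52, 49, 51, 50, 47, 53],   # tightly clustered around 50
    "B": [35, 62, 44, 58, 50, 41, 65, 45],   # widely scattered around 50
}

for name, data in data_sets.items():
    print(f"Data set {name}: mean = {statistics.mean(data):.1f}, "
          f"standard deviation = {statistics.stdev(data):.1f}")
```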
Practical notes
- Units: The standard deviation uses the same units as the original data, making it easier to interpret than variance.
- Non-negative: Standard deviation is never negative. The smallest possible value is $0$ (when all data values are exactly the same).
- Sensitive to outliers: Because standard deviation is based on squared distances, extreme values (outliers) can greatly increase it.
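The last point is easy to see with a quick sketch (again with made-up numbers): adding a single extreme value to an otherwise tight data set noticeably increases the standard deviation.

```python
import statistics

scores = [50, 51, 49, 50, 52, 48]        # tightly clustered scores
with_outlier = scores + [95]             # the same scores plus one extreme value

print("standard deviation without the outlier:", round(statistics.stdev(scores), 2))
print("standard deviation with the outlier:   ", round(statistics.stdev(with_outlier), 2))
```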
Understanding standard deviation prepares you for later chapters where:
- Distributions such as the normal distribution are described using a mean and standard deviation.
- You quantify uncertainty and risk in probability and statistics.
- You construct confidence intervals and perform hypothesis tests in inferential statistics.