Variance is a numerical measure of how spread out a set of values is around its mean. While the mean tells you a “central” value, variance tells you how much the values typically deviate from that center.
In this chapter we focus on:
- What variance measures conceptually.
- How to calculate variance for a population and for a sample.
- How variance relates to standard deviation.
- How to interpret variance in practical terms.
- Some basic properties of variance.
The ideas of data, mean, and standard deviation are assumed from other chapters in Descriptive Statistics; here we concentrate on variance itself.
What variance measures
Suppose you have a collection of numbers (a data set):
$$
x_1, x_2, x_3, \dots, x_n
$$
and their mean (average) is $\bar{x}$ (or $\mu$ for a population).
Variance measures how far, on average, the values $x_i$ are from the mean, but with a key twist:
- It looks at the squared deviations from the mean:
$$
(x_1 - \bar{x})^2,\ (x_2 - \bar{x})^2,\ \dots,\ (x_n - \bar{x})^2
$$
- Then it takes an average of these squared deviations (with a small difference between population and sample, which we will explain).
Because the deviations are squared:
- Values far from the mean contribute much more to the variance.
- The variance is always non-negative (never less than 0).
- The unit of variance is the square of the original unit (for example, if data are in meters, variance is in square meters).
In words:
- Large variance ⇒ data are widely spread around the mean.
- Small variance ⇒ data are tightly clustered near the mean.
- Zero variance ⇒ all data values are exactly the same.
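As a quick numerical illustration of these three cases, here is a minimal Python sketch (the data sets are made up for the example) using the standard library's `statistics` module:

```python
import statistics

# Two made-up data sets with the same mean (75) but different spread.
tight = [74, 75, 76]    # values clustered near the mean
spread = [55, 75, 95]   # values far from the mean

print(statistics.mean(tight), statistics.mean(spread))   # 75 75
print(statistics.pvariance(tight))     # 0.666... -> small variance
print(statistics.pvariance(spread))    # 266.666... -> large variance
print(statistics.pvariance([75] * 3))  # 0 -> zero variance, all values equal
```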
Population variance
When your data represent an entire population (not just a subset), the population variance is usually denoted by $\sigma^2$ (the Greek letter sigma squared).
Let a population have $N$ values:
$$
x_1, x_2, \dots, x_N
$$
with population mean
$$
\mu = \frac{1}{N}\sum_{i=1}^N x_i.
$$
The population variance is defined as
$$
\sigma^2 = \frac{1}{N}\sum_{i=1}^N (x_i - \mu)^2.
$$
Step by step:
- Compute the population mean $\mu$.
- For each value $x_i$, compute its deviation from the mean: $x_i - \mu$.
- Square each deviation: $(x_i - \mu)^2$.
- Add up all the squared deviations: $\sum_{i=1}^N (x_i - \mu)^2$.
- Divide by $N$.
Because you divide by $N$, this is exactly the average squared deviation from the mean for the whole population.
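A minimal Python sketch that follows these five steps literally (the function name is just illustrative):

```python
def population_variance(values):
    """Average squared deviation from the mean (denominator N)."""
    n = len(values)
    mu = sum(values) / n                            # step 1: population mean
    squared_devs = [(x - mu) ** 2 for x in values]  # steps 2-3: deviations, squared
    return sum(squared_devs) / n                    # steps 4-5: sum, divide by N
```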
Example: population variance
Suppose the (tiny) population of exam scores is:
$$
60,\ 70,\ 80,\ 90
$$
- Mean:
$$
\mu = \frac{60 + 70 + 80 + 90}{4} = \frac{300}{4} = 75.
$$
- Deviations from the mean:
- $60 - 75 = -15$
- $70 - 75 = -5$
- $80 - 75 = 5$
- $90 - 75 = 15$
- Squared deviations:
- $(-15)^2 = 225$
- $(-5)^2 = 25$
- $5^2 = 25$
- $15^2 = 225$
- Sum of squared deviations:
$$
225 + 25 + 25 + 225 = 500.
$$
- Divide by $N = 4$:
$$
\sigma^2 = \frac{500}{4} = 125.
$$
So the population variance is $125$ (in “points squared”).
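You can confirm this with Python's `statistics.pvariance`, which implements exactly this population formula:

```python
import statistics

scores = [60, 70, 80, 90]
print(statistics.pvariance(scores))  # 125
```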
Sample variance
In practice, you rarely have access to an entire population. Instead, you often have a sample drawn from that population. When you calculate variance from a sample and want to use it as an estimate of the population variance, you use a slightly different formula.
Let your sample have $n$ values:
$$
x_1, x_2, \dots, x_n
$$
with sample mean
$$
\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i.
$$
The sample variance is usually denoted by $s^2$ and defined as
$$
s^2 = \frac{1}{n - 1}\sum_{i=1}^n (x_i - \bar{x})^2.
$$
The only visible difference from the population formula is the denominator: $n - 1$ instead of $n$.
Why $n - 1$?
- Intuitively, when you estimate from a sample, using $\bar{x}$ (instead of the true $\mu$) tends to underestimate the variability a bit: $\bar{x}$ is the value that minimizes the sum of squared deviations for that particular sample, so deviations from $\bar{x}$ are on average slightly smaller than deviations from $\mu$.
- Dividing by $n-1$ instead of $n$ “corrects” this underestimation, making $s^2$ an unbiased estimator of the population variance under typical assumptions.
You do not need the full proof here; just remember:
- Use denominator $N$ for a full population (formula for $\sigma^2$).
- Use denominator $n - 1$ for a sample, when estimating a population variance (formula for $s^2$).
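A small sketch that makes the one-line difference between the two formulas explicit (the `sample` flag is an illustrative name, not a standard API):

```python
def variance(values, sample=False):
    """Variance with a switchable denominator: n - 1 for a sample, N for a population."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    return ss / (n - 1) if sample else ss / n
```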
Example: sample variance
Take the same four scores, but now treat them as a sample, not a population:
$$
60,\ 70,\ 80,\ 90
$$
The sample mean $\bar{x}$ is the same as before:
$$
\bar{x} = 75.
$$
We reuse the squared deviations we already computed:
- $(60 - 75)^2 = 225$
- $(70 - 75)^2 = 25$
- $(80 - 75)^2 = 25$
- $(90 - 75)^2 = 225$
Sum of squared deviations: $500$.
Now, for the sample variance we divide by $n - 1 = 4 - 1 = 3$:
$$
s^2 = \frac{500}{3} \approx 166.67.
$$
So, treated as a sample, the variance ($\approx 166.67$) is larger than the population variance of the same numbers ($125$): dividing by $n - 1$ instead of $n$ inflates the result to compensate for estimating the mean from the sample itself.
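The standard library mirrors this distinction: `statistics.variance` divides by $n - 1$, while `statistics.pvariance` divides by $N$:

```python
import statistics

scores = [60, 70, 80, 90]
print(statistics.variance(scores))   # 166.666... (sample, divides by n - 1)
print(statistics.pvariance(scores))  # 125 (population, divides by N)
```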
Shortcut (computational) formula
For hand or computer calculations, it can sometimes be easier to use an equivalent formula that does not require computing every deviation first.
For a population, another formula for variance is
$$
\sigma^2 = \frac{1}{N} \sum_{i=1}^N x_i^2 - \mu^2.
$$
For a sample, the corresponding form is
$$
s^2 = \frac{1}{n - 1} \left( \sum_{i=1}^n x_i^2 - n\bar{x}^2 \right).
$$
The idea is:
- Compute $\sum x_i$ and $\sum x_i^2$.
- Use them to calculate $\mu$ or $\bar{x}$, then plug into these formulas.
These are algebraically equivalent to the original definitions and let you compute the variance in a single pass over the data, which can be faster for large data sets or programmed calculations. (One caution: in floating-point arithmetic the shortcut form can lose precision when the mean is large relative to the spread, so the two-pass definition is often preferred in software.)
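A single-pass sketch based on these shortcut formulas (names are illustrative):

```python
def variances_single_pass(values):
    """Return (population, sample) variance from one pass over the data."""
    n = 0
    total = 0.0     # running sum of x_i
    total_sq = 0.0  # running sum of x_i squared
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    pop_var = total_sq / n - mean ** 2               # sigma^2 shortcut
    samp_var = (total_sq - n * mean ** 2) / (n - 1)  # s^2 shortcut
    return pop_var, samp_var

print(variances_single_pass([60, 70, 80, 90]))  # (125.0, 166.666...)
```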
Relationship between variance and standard deviation
Standard deviation (covered in its own chapter) is directly based on variance:
- Population standard deviation:
$$
\sigma = \sqrt{\sigma^2}.
$$
- Sample standard deviation:
$$
s = \sqrt{s^2}.
$$
In words: standard deviation is the square root of variance.
Because variance uses squared units, taking the square root returns to the original units of the data, which often makes the standard deviation easier to interpret. Variance is still very important mathematically and in many formulas, even when we talk more intuitively in terms of standard deviation.
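In code this is just a square root; the standard library also provides `statistics.pstdev` and `statistics.stdev` directly:

```python
import math
import statistics

scores = [60, 70, 80, 90]
print(math.sqrt(statistics.pvariance(scores)))  # 11.180... (population sd)
print(statistics.pstdev(scores))                # same value, computed directly
print(statistics.stdev(scores))                 # 12.909... (sample sd, sqrt of 500/3)
```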
Interpreting variance in practice
Here are some general guidelines to help you interpret variance:
- Zero variance:
- All data values are equal.
- Every deviation from the mean is 0, so the sum of squared deviations is 0.
- Relative size of variance:
- Comparing variances is only meaningful when the data are on the same scale.
- If dataset A and dataset B measure the same quantity (e.g., both are exam scores), and A has variance $20$ while B has variance $80$, B is more variable than A; its values are more spread out from the mean.
- Units and squared units:
- If data are in centimeters, variance is in square centimeters.
- For communication, people often translate to standard deviation; however, in formulas for many statistical methods, variance is the core underlying quantity.
Example of comparison
Suppose you have two classes of test scores.
- Class 1: variance $= 10$
- Class 2: variance $= 50$
Both classes have the same mean score (say, $75$), but Class 2’s scores are more widely spread. There might be more very high and very low scores in Class 2, while Class 1’s scores are more tightly clustered around 75.
Basic properties of variance
Variance has several important mathematical properties. Here are some of the most useful, stated informally.
Non-negativity
Variance is always greater than or equal to 0:
$$
\sigma^2 \ge 0,\quad s^2 \ge 0.
$$
It is 0 if and only if all data values are equal.
Effect of adding a constant
If you add the same constant $c$ to every data value, the variance does not change.
Let $y_i = x_i + c$ for all $i$. Then:
$$
\text{Var}(y_1, \dots, y_n) = \text{Var}(x_1, \dots, x_n).
$$
Reason: adding a constant shifts all values and the mean by the same amount, but it does not change how far values are from the mean.
Effect of multiplying by a constant
If you multiply every data value by a constant $a$, the variance is multiplied by $a^2$.
Let $y_i = a x_i$ for all $i$. Then:
$$
\text{Var}(y_1, \dots, y_n) = a^2 \,\text{Var}(x_1, \dots, x_n).
$$
Example: converting temperatures from degrees Celsius to degrees Fahrenheit uses the linear transformation $F = \tfrac{9}{5}C + 32$, so the variance in Fahrenheit units is $\left(\tfrac{9}{5}\right)^2 = 3.24$ times the variance in Celsius units.
Combining these transformations
For a linear transformation $y_i = a x_i + c$:
- Adding $c$ does not change variance.
- Multiplying by $a$ multiplies variance by $a^2$.
So overall:
$$
\text{Var}(a x_1 + c,\dots,a x_n + c) = a^2 \,\text{Var}(x_1,\dots,x_n).
$$
This property is especially important when rescaling data (for example, when converting units or standardizing variables).
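A quick numerical check of both effects at once (the data and the constants $a$ and $c$ are arbitrary):

```python
import statistics

xs = [60, 70, 80, 90]
a, c = 2, 100
ys = [a * x + c for x in xs]     # linear transformation y = a*x + c

print(statistics.pvariance(xs))  # 125
print(statistics.pvariance(ys))  # 500 = a**2 * 125; the shift c has no effect
```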
Variance is the central measure of spread in many areas of probability and statistics. Understanding how it is defined, when to use the population versus sample formula, and how to interpret its size will prepare you for later topics such as standard deviation, probability distributions, and inferential methods that rely on variance.