
13.4.2 Variance

Variance is a numerical measure of how spread out a set of values is around its mean. While the mean tells you a “central” value, variance tells you how much the values typically deviate from that center.

In this chapter we focus on:

  • the definition of variance for a population and for a sample,
  • a shortcut formula for computing it,
  • its relationship to standard deviation, and
  • its basic properties and how to interpret them.

The ideas of data, mean, and standard deviation are assumed from other chapters in Descriptive Statistics; here we concentrate on variance itself.

What variance measures

Suppose you have a collection of numbers (a data set):

$$
x_1, x_2, x_3, \dots, x_n
$$

and their mean (average) is $\bar{x}$ (or $\mu$ for a population).

Variance measures how far, on average, the values $x_i$ are from the mean, but with a key twist: instead of the raw deviations $x_i - \bar{x}$, it uses the squared deviations

$$
(x_1 - \bar{x})^2,\ (x_2 - \bar{x})^2,\ \dots,\ (x_n - \bar{x})^2.
$$

Because the deviations are squared:

  • positive and negative deviations cannot cancel each other out, and
  • values far from the mean count much more heavily than values close to it.

In words: variance is the average of the squared deviations from the mean.

Population variance

When your data represent an entire population (not just a subset), the population variance is usually denoted by $\sigma^2$ (the Greek letter sigma squared).

Let a population have $N$ values:

$$
x_1, x_2, \dots, x_N
$$

with population mean

$$
\mu = \frac{1}{N}\sum_{i=1}^N x_i.
$$

The population variance is defined as

$$
\sigma^2 = \frac{1}{N}\sum_{i=1}^N (x_i - \mu)^2.
$$

Step by step:

  1. Compute the population mean $\mu$.
  2. For each value $x_i$, compute its deviation from the mean: $x_i - \mu$.
  3. Square each deviation: $(x_i - \mu)^2$.
  4. Add up all the squared deviations: $\sum_{i=1}^N (x_i - \mu)^2$.
  5. Divide by $N$.

Because you divide by $N$, this is exactly the average squared deviation from the mean for the whole population.
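
As a concrete illustration, here is a minimal Python sketch that follows these five steps directly (the function name `population_variance` is just illustrative, not a library routine):

```python
def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    N = len(values)
    mu = sum(values) / N                            # step 1: population mean
    squared_devs = [(x - mu) ** 2 for x in values]  # steps 2-3: deviations, squared
    return sum(squared_devs) / N                    # steps 4-5: sum, divide by N

print(population_variance([2, 4, 6]))  # 2.666..., i.e. 8/3
```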

Example: population variance

Suppose the (tiny) population of exam scores is:

$$
60,\ 70,\ 80,\ 90
$$

  1. Mean:
    $$
    \mu = \frac{60 + 70 + 80 + 90}{4} = \frac{300}{4} = 75.
    $$
  2. Deviations from the mean:
    • $60 - 75 = -15$
    • $70 - 75 = -5$
    • $80 - 75 = 5$
    • $90 - 75 = 15$
  3. Squared deviations:
    • $(-15)^2 = 225$
    • $(-5)^2 = 25$
    • $5^2 = 25$
    • $15^2 = 225$
  4. Sum of squared deviations:
    $$
    225 + 25 + 25 + 225 = 500.
    $$
  5. Divide by $N = 4$:
    $$
    \sigma^2 = \frac{500}{4} = 125.
    $$

So the population variance is $125$ (in “points squared”).
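
For a quick check, Python's standard-library `statistics.pvariance` uses the same divide-by-$N$ definition and reproduces this result:

```python
import statistics

scores = [60, 70, 80, 90]
print(statistics.pvariance(scores))  # 125
```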

Sample variance

In practice, you rarely have access to an entire population. Instead, you often have a sample drawn from that population. When you calculate variance from a sample and want to use it as an estimate of the population variance, you use a slightly different formula.

Let your sample have $n$ values:

$$
x_1, x_2, \dots, x_n
$$

with sample mean

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i.
$$

The sample variance is usually denoted by $s^2$ and defined as

$$
s^2 = \frac{1}{n - 1}\sum_{i=1}^n (x_i - \bar{x})^2.
$$

The only visible difference from the population formula is the denominator: $n - 1$ instead of $n$.

Why $n - 1$?

You do not need the full proof here; just remember:

  • The sample mean $\bar{x}$ is computed from the data itself, so the data values tend to lie slightly closer to $\bar{x}$ than to the true population mean $\mu$.
  • Dividing the sum of squared deviations by $n$ would therefore tend to underestimate the population variance.
  • Dividing by $n - 1$ (Bessel's correction) compensates for this underestimate and makes $s^2$ an unbiased estimator of $\sigma^2$.

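The following minimal simulation sketch (standard library only; the parameter values are arbitrary) illustrates the point: it repeatedly draws small samples from a population whose variance is $10^2 = 100$ and compares the average of the divide-by-$n$ and divide-by-$(n-1)$ estimates.

```python
import random
import statistics

random.seed(0)
POP_MEAN, POP_SD = 75, 10          # population variance is 10**2 = 100
n, trials = 5, 20_000

biased, unbiased = [], []
for _ in range(trials):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    biased.append(ss / n)          # divide by n
    unbiased.append(ss / (n - 1))  # divide by n - 1

print(statistics.mean(biased))     # noticeably below 100 on average (around 80)
print(statistics.mean(unbiased))   # close to 100 on average
```
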
Example: sample variance

Take the same four scores, but now treat them as a sample, not a population:

$$
60,\ 70,\ 80,\ 90
$$

The sample mean $\bar{x}$ is the same as before:

$$
\bar{x} = 75.
$$

We reuse the squared deviations we already computed:

Sum of squared deviations: $500$.

Now, for the sample variance we divide by $n - 1 = 4 - 1 = 3$:

$$
s^2 = \frac{500}{3} \approx 166.67.
$$

So, treated as a sample, the variance ($\approx 166.67$) is larger than the population variance ($125$), reflecting the $n - 1$ correction used when estimating from a sample.
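
As a check, `statistics.variance` in Python's standard library uses the $n - 1$ denominator and agrees with this hand calculation:

```python
import statistics

scores = [60, 70, 80, 90]
print(statistics.variance(scores))   # 166.66..., i.e. 500/3 (sample variance)
print(statistics.pvariance(scores))  # 125 (population variance, for comparison)
```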

Shortcut (computational) formula

For hand or computer calculations, it can sometimes be easier to use an equivalent formula that does not require computing every deviation first.

For a population, another formula for variance is

$$
\sigma^2 = \frac{1}{N} \sum_{i=1}^N x_i^2 - \mu^2.
$$

For a sample, the corresponding form is

$$
s^2 = \frac{1}{n - 1} \left( \sum_{i=1}^n x_i^2 - n\bar{x}^2 \right).
$$

The idea is:

  • keep two running totals as you pass through the data, the sum of the values and the sum of their squares, and
  • combine them at the end, instead of making a second pass to compute every deviation from the mean.

These are algebraically equivalent to the original definitions, but sometimes faster for large data sets or programmable calculations.
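
The sketch below (illustrative function names, standard library only) computes the sample variance both ways and shows that the one-pass shortcut matches the two-pass definition, up to floating-point rounding:

```python
def sample_variance_two_pass(values):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Shortcut: needs only the running sums of x_i and x_i**2."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    xbar = total / n
    return (total_sq - n * xbar ** 2) / (n - 1)

data = [60, 70, 80, 90]
print(sample_variance_two_pass(data))   # 166.66...
print(sample_variance_shortcut(data))   # 166.66...
```

One caveat: for data with very large values and comparatively small spread, the one-pass shortcut can lose accuracy to floating-point cancellation, which is why careful numerical code often uses more stable one-pass algorithms (such as Welford's method).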

Relationship between variance and standard deviation

Standard deviation (covered in its own chapter) is directly based on variance:

$$
\sigma = \sqrt{\sigma^2}, \qquad s = \sqrt{s^2}.
$$

In words: standard deviation is the square root of variance.

Because variance uses squared units, taking the square root returns to the original units of the data, which often makes the standard deviation easier to interpret. Variance is still very important mathematically and in many formulas, even when we talk more intuitively in terms of standard deviation.
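
In code the relationship is just a square root; for example, with Python's standard library:

```python
import math
import statistics

scores = [60, 70, 80, 90]
print(math.sqrt(statistics.pvariance(scores)))  # ≈ 11.18 points
print(statistics.pstdev(scores))                # same value, computed directly
```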

Interpreting variance in practice

Here are some general guidelines to help you interpret variance:

  • A variance of $0$ means all values are identical; the larger the variance, the more spread out the values are around the mean.
  • Variance is expressed in squared units (for example, “points squared”), so its square root, the standard deviation, is usually easier to relate to the original data.
  • Comparing variances is most meaningful for data sets measured in the same units and on a similar scale: the data set with the larger variance is more spread out.

Example of comparison

Suppose you have two classes of test scores, and Class 2 has a noticeably larger variance than Class 1.

Both classes have the same mean score (say, $75$), but Class 2’s scores are more widely spread: there might be more very high and very low scores in Class 2, while Class 1’s scores are more tightly clustered around 75.

Basic properties of variance

Variance has several important mathematical properties. Here are some of the most useful, stated informally.

Non-negativity

Variance is always greater than or equal to 0:

$$
\sigma^2 \ge 0,\quad s^2 \ge 0.
$$

It is 0 if and only if all data values are equal.

Effect of adding a constant

If you add the same constant $c$ to every data value, the variance does not change.

Let $y_i = x_i + c$ for all $i$. Then:

$$
\text{Var}(y_1, \dots, y_n) = \text{Var}(x_1, \dots, x_n).
$$

Reason: adding a constant shifts all values and the mean by the same amount, but it does not change how far values are from the mean.

Effect of multiplying by a constant

If you multiply every data value by a constant $a$, the variance is multiplied by $a^2$.

Let $y_i = a x_i$ for all $i$. Then:

$$
\text{Var}(y_1, \dots, y_n) = a^2 \,\text{Var}(x_1, \dots, x_n).
$$

Example: converting temperatures from degrees Celsius to degrees Fahrenheit uses the linear map $F = 1.8C + 32$, so the variance in Fahrenheit units is $1.8^2 = 3.24$ times the variance in Celsius units.

Combining these transformations

For a linear transformation $y_i = a x_i + c$:

  • the added constant $c$ shifts the values but does not change the variance, and
  • the factor $a$ multiplies the variance by $a^2$.

So overall:

$$
\text{Var}(a x_1 + c,\dots,a x_n + c) = a^2 \,\text{Var}(x_1,\dots,x_n).
$$

This property is especially important when rescaling data (for example, when converting units or standardizing variables).
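
The short sketch below (arbitrary example values) demonstrates the transformation rule numerically, using the Celsius-to-Fahrenheit coefficients $a = 1.8$ and $c = 32$ mentioned above:

```python
import statistics

x = [60, 70, 80, 90]
a, c = 1.8, 32                           # Fahrenheit-style rescaling: y = 1.8x + 32
y = [a * xi + c for xi in x]

print(statistics.pvariance(x))           # 125
print(statistics.pvariance(y))           # 405
print(a ** 2 * statistics.pvariance(x))  # ≈ 405, i.e. a**2 times the original variance
```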


Variance is the central measure of spread in many areas of probability and statistics. Understanding how it is defined, when to use the population versus sample formula, and how to interpret its size will prepare you for later topics such as standard deviation, probability distributions, and inferential methods that rely on variance.
