In probability, we are often interested in more than just “what can happen” (the sample space) and “how likely is each event.” We also care about numerical outcomes: scores, heights, waiting times, profits, and so on. A probability distribution is a way to describe, in a precise and compact form, the probabilities of all possible values of a random variable.
This chapter gives a general introduction to probability distributions before you study particular examples (like the binomial and normal distributions) in later chapters.
Random variables and their distributions
A random variable is a variable whose value is determined by a random process. It assigns a number to each outcome of an experiment. For example:
- Flip a coin three times and let $X$ be the number of heads.
- Roll a die and let $Y$ be the square of the number that appears.
- Watch a bus stop and let $T$ be the waiting time (in minutes) until the next bus.
The probability distribution of a random variable is the rule that tells you how probabilities are spread over its possible values.
Informally:
- It lists (or describes) all possible values of the random variable.
- It gives the probability associated with each value (or each range of values).
Discrete vs continuous distributions
Random variables, and therefore distributions, are usually divided into two types.
- A random variable is discrete if its possible values form a finite or countable list (often integers such as 0, 1, 2, …).
- Example: the number of defective items in a box of 10.
- A random variable is continuous if it can take any value in an interval of real numbers.
- Example: the exact time (in seconds) for a runner to complete a race.
These two types of variables have probability distributions described in different ways.
Discrete probability distributions
For a discrete random variable $X$, you can list each possible value and the probability that $X$ takes that value.
The function that assigns these probabilities is called the probability mass function (pmf) of $X$.
If the possible values of $X$ are $x_1, x_2, x_3, \dots$, then the pmf is the function
$$
p(x) = P(X = x).
$$
This function must satisfy:
- $p(x) \ge 0$ for all possible $x$.
- The sum of probabilities over all possible values equals 1:
$$
\sum_x p(x) = 1.
$$
Simple examples of discrete distributions
- Single fair die
Let $X$ be the number rolled on a fair six-sided die. The possible values are $1, 2, 3, 4, 5, 6$.
The probability mass function is
$$
p(x) = P(X = x) = \frac{1}{6} \quad \text{for } x = 1,2,3,4,5,6.
$$
For all other $x$, $p(x) = 0$.
- Counting heads
Flip a fair coin twice and let $Y$ be the number of heads. The possible values are $0, 1, 2$.
A simple table can describe the distribution:
| $y$ | $P(Y = y)$ |
|---|---|
| 0 | $1/4$ |
| 1 | $1/2$ |
| 2 | $1/4$ |
Tables like this are a common way to present discrete distributions.
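To make this concrete, here is a minimal Python sketch (the function name `pmf_two_coins` is just illustrative) that builds the table above by enumerating the four equally likely outcomes and checks that the probabilities sum to 1:

```python
from itertools import product
from collections import Counter

def pmf_two_coins():
    # Enumerate the four equally likely outcomes of two fair coin flips
    outcomes = list(product("HT", repeat=2))          # HH, HT, TH, TT
    heads = Counter(o.count("H") for o in outcomes)  # number of heads in each outcome
    # Each outcome has probability 1/4, so p(y) = (#outcomes with y heads) / 4
    return {y: heads[y] / len(outcomes) for y in sorted(heads)}

p = pmf_two_coins()
print(p)                # {0: 0.25, 1: 0.5, 2: 0.25}
print(sum(p.values()))  # 1.0 -- a valid pmf sums to 1
```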
Distribution functions for discrete variables
Besides the pmf, another useful way to describe a probability distribution is the cumulative distribution function (cdf).
For any real number $x$, the cdf $F(x)$ of a random variable $X$ is defined by
$$
F(x) = P(X \le x).
$$
For a discrete $X$, $F(x)$ increases in jumps at the possible values of $X$.
For example, if $Y$ is the number of heads when flipping a fair coin twice, then:
- $F(0) = P(Y \le 0) = P(Y = 0) = 1/4$,
- $F(1) = P(Y \le 1) = P(Y = 0 \text{ or } 1) = 1/4 + 1/2 = 3/4$,
- $F(2) = P(Y \le 2) = 1$.
For $x < 0$, $F(x) = 0$; for $0 \le x < 1$, $F(x) = 1/4$; for $1 \le x < 2$, $F(x) = 3/4$; and for $x \ge 2$, $F(x) = 1$.
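In code, the cdf of a discrete variable is just a running sum of the pmf. A short Python sketch (with the pmf of $Y$ written out directly) illustrates the jumps:

```python
# pmf of Y = number of heads in two fair coin flips (the table above)
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def cdf(x):
    # F(x) = P(Y <= x): sum p(y) over all possible values y that are <= x
    return sum(p for y, p in pmf.items() if y <= x)

print(cdf(-1))   # 0.0   (no possible value is <= -1)
print(cdf(0))    # 0.25
print(cdf(0.5))  # 0.25  (F is flat between the jumps)
print(cdf(1))    # 0.75
print(cdf(5))    # 1.0
```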
The cdf has some general properties that hold for any random variable:
- $F(x)$ is non-decreasing (it never goes down as $x$ increases).
- $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.
Continuous probability distributions
For a continuous random variable $X$, the probability that $X$ takes any single exact value is 0:
$$
P(X = a) = 0 \quad \text{for any real } a.
$$
Instead, probabilities are assigned to intervals of values (for example, $P(a < X \le b)$).
The rule that assigns probabilities to intervals is often described using a probability density function (pdf), $f(x)$.
Informally, $f(x)$ is a non-negative function such that probabilities are given by areas under the curve of $f$:
$$
P(a \le X \le b) = \int_a^b f(x)\,dx.
$$
A pdf must satisfy:
- $f(x) \ge 0$ for all $x$,
- The total area under the curve is 1:
$$
\int_{-\infty}^{\infty} f(x)\,dx = 1.
$$
Example of a simple continuous distribution
Consider a random variable $X$ that is equally likely to take any value between 0 and 1 (this is called a uniform distribution on $[0,1]$).
The pdf is
$$
f(x) = \begin{cases}
1, & 0 \le x \le 1,\\[4pt]
0, & \text{otherwise.}
\end{cases}
$$
The total area under this curve from 0 to 1 is
$$
\int_0^1 1\,dx = 1,
$$
so it is a valid pdf.
The probability that $X$ lies between $0.2$ and $0.5$ is
$$
P(0.2 \le X \le 0.5) = \int_{0.2}^{0.5} 1\,dx = 0.3.
$$
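As a numerical sanity check, the following Python sketch (a plain midpoint-rule approximation, not a library routine) estimates these areas under the uniform pdf:

```python
def uniform_pdf(x):
    # pdf of the uniform distribution on [0, 1]
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, n=10_000):
    # Midpoint-rule approximation of the integral of f from a to b
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(integrate(uniform_pdf, 0.0, 1.0))  # ~1.0 : total area under the pdf
print(integrate(uniform_pdf, 0.2, 0.5))  # ~0.3 : P(0.2 <= X <= 0.5)
```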
Distribution functions for continuous variables
The cdf $F(x)$ of a continuous random variable $X$ is still defined by
$$
F(x) = P(X \le x).
$$
If $X$ has pdf $f(x)$, then
$$
F(x) = \int_{-\infty}^{x} f(t)\,dt.
$$
When the pdf is nice enough (for example, continuous), then:
- $F(x)$ is continuous,
- the pdf is the derivative of the cdf:
$$
f(x) = \frac{d}{dx} F(x).
$$
For the uniform example above,
$$
F(x) =
\begin{cases}
0, & x < 0,\\[4pt]
x, & 0 \le x \le 1,\\[4pt]
1, & x > 1.
\end{cases}
$$
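The middle case comes directly from the definition of the cdf: for $0 \le x \le 1$,
$$
F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x 1\,dt = x,
$$
and differentiating gives back the pdf, $f(x) = F'(x) = 1$, on that interval.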
Expected value and variance of a distribution
To summarize a probability distribution numerically, two key concepts are:
- Expected value (mean): a measure of the “center” of the distribution.
- Variance (and its square root, the standard deviation): a measure of how spread out the values are around the mean.
You will use these quantities frequently when working with specific distributions.
Expected value for discrete distributions
If $X$ is a discrete random variable with pmf $p(x)$, its expected value $E[X]$ (or $\mu$) is
$$
E[X] = \sum_x x\,p(x).
$$
This is a weighted average of the possible values, with weights equal to their probabilities.
The variance of $X$, denoted $\operatorname{Var}(X)$ or $\sigma^2$, is
$$
\operatorname{Var}(X) = E\big[(X - \mu)^2\big]
= \sum_x (x - \mu)^2 p(x).
$$
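For the fair die from the earlier example, where $p(x) = 1/6$ for $x = 1, \dots, 6$, these formulas give
$$
E[X] = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{21}{6} = 3.5,
\qquad
\operatorname{Var}(X) = \sum_{x=1}^{6} (x - 3.5)^2 \cdot \frac{1}{6} = \frac{17.5}{6} \approx 2.92.
$$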
Expected value for continuous distributions
If $X$ is a continuous random variable with pdf $f(x)$, then the expected value is
$$
E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx.
$$
The variance is
$$
\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx,
$$
where $\mu = E[X]$.
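For example, for the uniform distribution on $[0,1]$ introduced above,
$$
E[X] = \int_0^1 x\,dx = \frac{1}{2},
\qquad
\operatorname{Var}(X) = \int_0^1 \left(x - \tfrac{1}{2}\right)^2 dx = \frac{1}{12}.
$$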
The details of computing $E[X]$ and $\operatorname{Var}(X)$ for particular distributions (like the binomial or normal) will be covered in their own chapters. What matters here is that:
- every probability distribution has these summary measures (when they exist),
- they help you compare and understand different distributions.
Interpreting and using probability distributions
A probability distribution is more than just a formula. It is a model of how a random variable behaves.
Some common uses:
- Calculating event probabilities
Once you know the distribution of $X$, you can find probabilities such as $P(X = 3)$ (discrete) or $P(a < X < b)$ (continuous) using the pmf or pdf.
- Modeling real-world situations
Different real-life processes are naturally modeled by different distributions. For example, counts of “successes” in repeated trials often follow a binomial distribution; many measurement errors and natural variations are close to a normal distribution.
- Comparing scenarios
Two different processes may have the same mean but different variances. Their distributions tell you not only “on average” what to expect, but also how variable the outcomes are.
- Simulating randomness
In computer simulations, you often use random number generators to produce values from specified distributions. The distribution describes exactly how the simulated numbers should behave.
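As a small illustration of that last point, here is a Python sketch (using the standard `random` module; the sample size is arbitrary) that draws from the uniform distribution on $[0,1]$ and checks that the simulated values behave the way the distribution predicts:

```python
import random

random.seed(0)  # fix the seed so the run is reproducible

n = 100_000
draws = [random.random() for _ in range(n)]  # samples from the uniform distribution on [0, 1]

# The distribution says P(0.2 <= X <= 0.5) = 0.3; the empirical fraction should be close to that
frac = sum(0.2 <= x <= 0.5 for x in draws) / n
print(frac)  # approximately 0.3
```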
In later chapters on the binomial and normal distributions, you will see concrete, widely used examples of probability distributions, and you will learn specific formulas and methods for working with them. Here, the key ideas are:
- A random variable has a probability distribution.
- For discrete variables, we use a pmf and possibly a table.
- For continuous variables, we use a pdf and integrals (areas under curves).
- The cdf applies to both types and gives $P(X \le x)$.
- Expected value and variance summarize important features of a distribution.