13.3.1 Binomial distribution

The binomial distribution is one of the most important probability distributions for discrete random variables that count how many “successes” occur in a fixed number of repeated, independent trials.

In this chapter, we focus on:

When a binomial model is appropriate
How to compute binomial probabilities
Key characteristics: mean and variance
Using the binomial distribution in basic problem solving

The general ideas of probability distributions and random variables are assumed from the parent chapters.

Situations modeled by the binomial distribution

A random variable $X$ has a binomial distribution when all of the following conditions are met:

Fixed number of trials
There is a fixed number $n$ of trials (repetitions of an experiment).
Two possible outcomes on each trial
Each trial results in one of two outcomes, often called “success” and “failure.”
(These are just labels; “success” does not have to be something good.)
Constant probability of success
The probability of success on each trial is the same number $p$, with $0 \le p \le 1$.

Independence of trials
The result of any trial does not affect the result of any other trial.

When these conditions hold, we can say:

$X$ = number of successes in $n$ trials
$X$ follows a binomial distribution with parameters $n$ and $p$.

We write this as:
$$
X \sim \text{Binomial}(n, p)
$$

Examples of binomial situations

Flipping a fair coin $10$ times:

Trial: one coin flip
Success: “heads”
$n = 10$, $p = 0.5$
$X$ = number of heads in $10$ flips

Checking $20$ light bulbs from a large production line, where each bulb is defective with probability $0.02$:

Trial: checking one bulb
Success: “defective bulb”
$n = 20$, $p = 0.02$
$X$ = number of defective bulbs among the $20$

Asking $15$ people whether they like a new product, assuming each person independently likes it with probability $0.3$:

Trial: asking one person
Success: “person likes product”
$n = 15$, $p = 0.3$
$X$ = number of people who like the product

If any of the binomial conditions do not hold (for example, the probability of success changes from trial to trial or trials are not independent), then the binomial model is not appropriate.

Binomial probability formula

Suppose $X \sim \text{Binomial}(n, p)$.
The possible values of $X$ are the integers
$$
0, 1, 2, \dots, n.
$$

The probability that $X$ equals exactly $k$ (that is, exactly $k$ successes) is given by the binomial formula:
$$
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k},
$$
for $k = 0, 1, 2, \dots, n$.

Here, $\binom{n}{k}$ is the binomial coefficient (“$n$ choose $k$”), which counts how many different ways we can choose $k$ successes out of $n$ trials. It is defined by:
$$
\binom{n}{k} = \frac{n!}{k!(n-k)!}.
$$
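The formula translates almost directly into code. The following is a minimal sketch in Python, assuming the standard library's `math.comb` for the binomial coefficient; the function name `binomial_pmf` is an illustrative choice and does not come from the text.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        return 0.0  # values outside 0..n cannot occur
    # comb(n, k) counts the arrangements; the powers give the
    # probability of any one arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 2 successes in 4 trials with p = 0.5
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(k, n, p)` over all $k = 0, 1, \dots, n$ should give $1$ up to floating-point rounding, which is a quick sanity check for any implementation.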

How to interpret the formula

To get exactly $k$ successes in $n$ independent trials:

The probability of any particular sequence of $k$ successes and $n-k$ failures is
$$
p^k (1-p)^{n-k},
$$
because we multiply $p$ for each success and $(1-p)$ for each failure.
There are $\binom{n}{k}$ different sequences (arrangements) that contain $k$ successes and $n-k$ failures.

Multiplying these together gives:
$$
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.
$$

Example: coin flips

Flip a fair coin ($p = 0.5$ for heads) $n = 4$ times. Let $X$ be the number of heads.

Exactly 2 heads
Here, $n = 4$, $p = 0.5$, $k = 2$:
$$
P(X = 2)
= \binom{4}{2} (0.5)^2 (0.5)^{4-2}
= \binom{4}{2} (0.5)^4
= 6 \cdot \frac{1}{16}
= \frac{6}{16}
= \frac{3}{8}.
$$
Exactly 0 heads (so all tails)
$k = 0$:
$$
P(X = 0)
= \binom{4}{0} (0.5)^0 (0.5)^4
= 1 \cdot 1 \cdot \frac{1}{16}
= \frac{1}{16}.
$$
Exactly 4 heads
$k = 4$:
$$
P(X = 4)
= \binom{4}{4} (0.5)^4 (0.5)^0
= 1 \cdot \frac{1}{16} \cdot 1
= \frac{1}{16}.
$$

The other possible values $k = 1$ and $k = 3$ can be found in the same way.
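As a check on the hand calculations, the whole distribution for the four coin flips can be tabulated. This sketch uses Python's `fractions.Fraction` so that the results stay exact; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 4
p = Fraction(1, 2)  # fair coin, kept as an exact fraction

# P(X = k) for every possible number of heads k = 0, ..., 4
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

for k, prob in pmf.items():
    print(k, prob)
# 0 1/16
# 1 1/4
# 2 3/8
# 3 1/4
# 4 1/16

print(sum(pmf.values()))  # 1
```

The values for $k = 1$ and $k = 3$ are both $\frac{4}{16} = \frac{1}{4}$, and the five probabilities add up to $1$.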

Cumulative binomial probabilities

Sometimes we are interested in the probability that the number of successes is at most, at least, or between certain values, rather than exactly one value.

For $X \sim \text{Binomial}(n, p)$:

At most $k$ successes:
$$
P(X \le k) = P(X = 0) + P(X = 1) + \cdots + P(X = k).
$$
At least $k$ successes:
$$
P(X \ge k) = P(X = k) + P(X = k+1) + \cdots + P(X = n).
$$
Often, it is easier to use the complement:
$$
P(X \ge k) = 1 - P(X \le k-1).
$$
Between $a$ and $b$ successes (inclusive):
$$
P(a \le X \le b) = P(X = a) + P(X = a+1) + \cdots + P(X = b).
$$

For small $n$, you can compute these by hand by summing the binomial probabilities. For larger $n$, it is common to use a calculator, spreadsheet, or software with built-in binomial functions.
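The three cumulative formulas can be sketched as small Python helpers. The function names (`prob_at_most`, `prob_at_least`, `prob_between`) are hypothetical and chosen for readability; statistical libraries offer their own built-in equivalents.

```python
from math import comb


def pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k): sum the terms for 0, 1, ..., k."""
    return sum(pmf(i, n, p) for i in range(0, k + 1))


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k), computed through the complement 1 - P(X <= k - 1)."""
    return 1.0 - prob_at_most(k - 1, n, p)


def prob_between(a: int, b: int, n: int, p: float) -> float:
    """P(a <= X <= b), both ends included."""
    return sum(pmf(i, n, p) for i in range(a, b + 1))


# Between 1 and 3 heads (inclusive) in 4 fair coin flips
print(prob_between(1, 3, 4, 0.5))  # 0.875
```

Using the complement in `prob_at_least` mirrors the shortcut above: when $k$ is small, it needs far fewer terms than summing from $k$ up to $n$.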

Example: at least one success

Suppose $X \sim \text{Binomial}(n=5, p=0.2)$ and we want $P(X \ge 1)$.

It is usually easier to compute the complement “no successes”:

$P(X = 0) = \binom{5}{0} (0.2)^0 (0.8)^5 = 1 \cdot 1 \cdot 0.8^5$.
Thus
$$
P(X \ge 1) = 1 - P(X = 0) = 1 - 0.8^5 = 1 - 0.32768 = 0.67232.
$$

This is often faster than adding $P(X=1) + P(X=2) + \dots + P(X=5)$.

Mean and variance of a binomial distribution

For a binomial random variable $X \sim \text{Binomial}(n, p)$, the mean and variance have simple formulas.

Mean (expected value):
$$
\mu = E[X] = np.
$$
Variance:
$$
\sigma^2 = \text{Var}(X) = np(1-p).
$$
Standard deviation:
$$
\sigma = \sqrt{np(1-p)}.
$$

These formulas come from thinking of $X$ as the sum of $n$ independent “success/failure” variables, each with mean $p$ and variance $p(1-p)$.
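This argument can be written out in a few lines. The indicator variables $X_i$ are notation introduced here for illustration: $X_i = 1$ if trial $i$ is a success and $X_i = 0$ otherwise, so that
$$
X = X_1 + X_2 + \cdots + X_n.
$$
For a single trial,
$$
E[X_i] = 1 \cdot p + 0 \cdot (1-p) = p,
\qquad
\text{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1-p).
$$
Expected values always add, and variances add because the trials are independent, so
$$
E[X] = \sum_{i=1}^{n} E[X_i] = np,
\qquad
\text{Var}(X) = \sum_{i=1}^{n} \text{Var}(X_i) = np(1-p).
$$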

Interpreting the mean

If $X$ counts successes in $n$ trials, and the probability of success is $p$, then:

On average, you expect about $np$ successes.

For example, if $n = 100$ and $p = 0.3$, then:

Mean: $E[X] = 100 \cdot 0.3 = 30$.

This does not mean you will always get exactly $30$ successes, but over many repetitions of the same experiment, the average number of successes will be close to $30$.
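A small simulation illustrates this long-run interpretation. The sketch below uses only Python's standard `random` module; the seed, the number of repetitions (2000), and the helper name `simulate_binomial` are arbitrary choices made for the illustration.

```python
import random


def simulate_binomial(n: int, p: float, rng: random.Random) -> int:
    """Run n independent trials and count the successes."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Repeat the whole experiment (100 trials with p = 0.3) many times
counts = [simulate_binomial(100, 0.3, rng) for _ in range(2000)]
average = sum(counts) / len(counts)

print(min(counts), max(counts))  # individual results vary noticeably
print(average)                   # the average should land close to 30
```

Individual experiments typically give counts spread over a range around $30$, while the average over all repetitions settles near $np = 30$.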

Example with mean and variance

Let $X \sim \text{Binomial}(n=10, p=0.4)$.

Mean:
$$
E[X] = np = 10 \cdot 0.4 = 4.
$$
Variance:
$$
\text{Var}(X) = np(1-p) = 10 \cdot 0.4 \cdot 0.6 = 2.4.
$$
Standard deviation:
$$
\sigma = \sqrt{2.4} \approx 1.55.
$$

So, in $10$ trials, you expect about $4$ successes on average, with typical variation around that average measured by the standard deviation $\sqrt{2.4} \approx 1.55$.
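The shortcut formulas can be checked against the general definitions $E[X] = \sum_k k \, P(X = k)$ and $\text{Var}(X) = \sum_k (k - \mu)^2 \, P(X = k)$. The following sketch does this numerically for $n = 10$, $p = 0.4$; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.4

# Probability of each possible value k = 0, ..., n
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance straight from their definitions
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
std_dev = sqrt(variance)

print(round(mean, 6), round(variance, 6), round(std_dev, 3))  # 4.0 2.4 1.549
```

The results agree with $np = 4$ and $np(1-p) = 2.4$, up to floating-point rounding.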

Shape of the binomial distribution

The binomial distribution is a discrete distribution, so it is usually represented by a bar chart (a probability mass function).

The shape depends on $n$ and $p$:

When $p = 0.5$, the distribution is symmetric around the mean $np$.
When $p < 0.5$, the distribution is skewed to the right (more mass near $0$).
When $p > 0.5$, the distribution is skewed to the left (more mass near $n$).

As $n$ becomes large, the binomial distribution increasingly resembles a bell-shaped curve (a normal distribution), provided $p$ is not too close to $0$ or $1$, so that both $np$ and $n(1-p)$ are reasonably large. Approximations of this type are usually treated in more advanced chapters.
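To see these shapes without plotting software, the probabilities can be printed as a rough text bar chart. This is only a sketch: the helper names `pmf_list` and `print_bar_chart` and the bar width of 40 characters are arbitrary choices.

```python
from math import comb


def pmf_list(n: int, p: float) -> list[float]:
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def print_bar_chart(n: int, p: float, width: int = 40) -> None:
    """Print one line per value k, with a bar scaled to the largest probability."""
    probs = pmf_list(n, p)
    peak = max(probs)
    for k, prob in enumerate(probs):
        bar = "#" * round(width * prob / peak)
        print(f"{k:2d} {prob:.3f} {bar}")


for p in (0.2, 0.5, 0.8):
    print(f"n = 10, p = {p}")
    print_bar_chart(10, p)
    print()
```

For $p = 0.2$ the longest bars sit near $0$ with a tail stretching to the right, for $p = 0.5$ the chart is symmetric around $5$, and for $p = 0.8$ the picture is the mirror image of the $p = 0.2$ case.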

Basic problem-solving with the binomial distribution

To solve a binomial problem, you typically:

Identify $n$ and $p$
Check that the situation fits the binomial conditions. Determine the number of trials $n$ and the success probability $p$.
Define the random variable $X$ clearly
State what $X$ counts (e.g., “$X$ = number of customers who buy the product”).
Use the binomial formula or cumulative sums

For “exactly $k$,” use
$$
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.
$$
For “at most,” “at least,” or “between” statements, sum several $P(X=k)$ terms or use complements.

If needed, find mean and standard deviation
Use $E[X] = np$ and $\sigma = \sqrt{np(1-p)}$.

Example: quality control

A factory produces items, and each item has probability $p = 0.1$ of being defective, independently of the others. A quality inspector tests $n = 8$ items. Let $X$ be the number of defective items among the $8$.

Probability of exactly 2 defective items
$$
P(X = 2) = \binom{8}{2} (0.1)^2 (0.9)^6 = 28 \cdot 0.01 \cdot 0.531441 \approx 0.1488.
$$
Probability of at most 1 defective item
$$
P(X \le 1) = P(X = 0) + P(X = 1)
$$
with
$$
P(X=0) = \binom{8}{0} (0.1)^0 (0.9)^8 = 0.9^8 \approx 0.4305,
$$
$$
P(X=1) = \binom{8}{1} (0.1)^1 (0.9)^7 = 8 \cdot 0.1 \cdot 0.9^7 \approx 0.3826,
$$
so $P(X \le 1) \approx 0.4305 + 0.3826 = 0.8131$.
Expected number of defective items
$$
E[X] = np = 8 \cdot 0.1 = 0.8.
$$

So on average, out of 8 tested items, the inspector expects $0.8$ defective items.
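The three quantities in this example can be evaluated numerically. The sketch below follows the steps above using Python's standard library; the variable names are illustrative.

```python
from math import comb

n, p = 8, 0.1  # 8 items tested, each defective with probability 0.1


def pmf(k: int) -> float:
    """P(X = k) for the number of defective items among the 8."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_exactly_2 = pmf(2)          # exactly 2 defective items
p_at_most_1 = pmf(0) + pmf(1) # at most 1 defective item
expected = n * p              # expected number of defective items

print(f"{p_exactly_2:.4f}")  # 0.1488
print(f"{p_at_most_1:.4f}")  # 0.8131
print(f"{expected:.1f}")     # 0.8
```

So roughly $81\%$ of such inspections would find at most one defective item, and about $15\%$ would find exactly two.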

Summary

A binomial distribution models the number of successes in $n$ independent trials, each with the same success probability $p$.
We write $X \sim \text{Binomial}(n, p)$ when $X$ counts the number of successes.
The probability of exactly $k$ successes is
$$
P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}.
$$
The mean and variance are
$$
E[X] = np, \quad \text{Var}(X) = np(1-p).
$$
Binomial methods are widely used to model counts of events such as correct answers, defective items, or positive responses, whenever the binomial conditions reasonably hold.
