
Discrete variables

In this chapter we focus on discrete random variables: random variables that take only isolated, separate values (often whole numbers), not every value in an interval.

You should already be familiar with the basic idea of a random variable from the parent chapter; here we specialize to the discrete case.

What makes a variable “discrete”?

A random variable $X$ is called discrete if its set of possible values is finite or countably infinite, so that the values can be listed one by one as $x_1, x_2, x_3, \dots$.

In contrast, continuous random variables can take any value in an interval (these are treated in the chapter on continuous variables).

Typical situations that lead to discrete random variables involve counting: the number of heads in a series of coin tosses, the number of defective items in a batch, or the number of customers arriving during an hour.

If you count “how many” or classify into a finite number of categories, you are usually dealing with a discrete random variable.

Probability mass function (pmf)

For discrete random variables, probabilities are described by a probability mass function (pmf).

For a discrete random variable $X$, its pmf is a function $p$ defined by
$$
p(x) = P(X = x).
$$

So $p(x)$ is the probability that $X$ takes the exact value $x$.

Because $X$ is discrete, we can list all possible values $x_1, x_2, x_3, \dots$ that $X$ can take and assign a probability to each. These probabilities must satisfy $p(x_i) \ge 0$ for every $i$ and $\sum_i p(x_i) = 1$.

You can think of the pmf as a table or a list: each possible value in one column, its probability in the other. For a fair six-sided die, for instance, $X$ takes the values $1, 2, \dots, 6$ and $p(x) = \tfrac{1}{6}$ for each of them.

For a non-uniform example, suppose $X$ can take the values $0,1,2$ with
$$
P(X=0)=0.2,\quad P(X=1)=0.5,\quad P(X=2)=0.3.
$$
This defines the pmf $p$; note that $0.2+0.5+0.3=1$.
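To make this concrete, here is a minimal Python sketch (the `pmf` dictionary simply mirrors the example above and is not tied to any particular library) that stores the pmf and checks the two defining properties:

```python
# pmf from the example above: X takes the values 0, 1, 2.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# Every probability must be non-negative ...
assert all(p >= 0 for p in pmf.values())

# ... and the probabilities must sum to 1 (allowing tiny floating-point error).
assert abs(sum(pmf.values()) - 1.0) < 1e-12
```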

Probabilities of events using the pmf

For a discrete random variable, probabilities of events are computed by adding pmf values: for any set $A$ of possible values,
$$
P(X \in A) = \sum_{x \in A} p(x).
$$
With the pmf above, for example, $P(X \ge 1) = p(1) + p(2) = 0.5 + 0.3 = 0.8$.

This “add the probabilities of each value” rule is specific to discrete variables and is simpler than in the continuous case.
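The same rule in a short Python sketch, reusing the illustrative pmf dictionary from above:

```python
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# Event A = {X >= 1}: add the pmf values of every outcome in the event.
prob_at_least_one = sum(p for x, p in pmf.items() if x >= 1)

print(prob_at_least_one)  # 0.8, i.e. p(1) + p(2)
```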

Distribution tables and probability histograms

Because there are separate possible values, discrete distributions are often shown as a distribution table (each value paired with its probability) or as a probability histogram (one bar per value, with the bar height equal to the probability).

For a discrete variable, the height of each bar is a probability (not a density), so the bar heights sum to 1; if each bar has width 1, the total area of the bars is also 1.
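A minimal matplotlib sketch of such a probability histogram, again using the illustrative three-value pmf (the styling choices are arbitrary):

```python
import matplotlib.pyplot as plt

pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# One bar per possible value; the bar height is the probability p(x).
plt.bar(list(pmf.keys()), list(pmf.values()), width=1.0, edgecolor="black")
plt.xticks(list(pmf.keys()))
plt.xlabel("value of X")
plt.ylabel("probability")
plt.title("Probability histogram")
plt.show()
```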

Common examples of discrete random variables

Several standard types of discrete random variables appear frequently. Their detailed formulas belong to later chapters on specific distributions, but it is useful here to see the types of situations that lead to discrete variables.

1. Bernoulli-type variables (success/failure)

These have only two possible outcomes, often coded as $1$ for “success” and $0$ for “failure”.

Examples: whether a coin toss lands heads, whether a manufactured item is defective, or whether a patient responds to a treatment.

$X$ takes values in $\{0,1\}$, so it is clearly discrete.

2. Counts in a fixed number of trials (binomial-type)

Here you repeat the same experiment a fixed number of times and count how many successes you get.

Examples: the number of heads in $n$ coin tosses, or the number of defective items in a random sample of $n$ products.

Then $X$ takes values $0,1,2,\dots,n$ for some fixed $n$, a finite set.

3. Counts over an interval (Poisson-type)

Here you count how many times something happens in a fixed period or region.

Examples: the number of emails arriving in an hour, the number of calls reaching a call center in a day, or the number of typos on a printed page.

$X$ takes values in $\{0,1,2,3,\dots\}$, which is countably infinite, so it is still discrete.

4. Discrete outcomes from simple experiments

Some experiments have a small set of outcomes that are naturally discrete: the result of rolling a die (the values $1$ to $6$), the suit of a card drawn from a deck, or the grade assigned on a fixed scale.

All of these are discrete variables with finite possible value sets.

Expected value and variance (discrete case)

The parent chapter on random variables introduces the ideas of expected value (mean) and variance. For discrete random variables, these concepts have specific formulas using sums.

Let $X$ be a discrete random variable with possible values $x_1, x_2, x_3, \dots$ and pmf $p$.

Expected value (mean)

The expected value of $X$, written $E[X]$ or $\mu$, is
$$
E[X] = \sum_{i} x_i \, p(x_i).
$$

You multiply each possible value by its probability and add them all up.

Example: Let $X$ take values $0,1,2$ with probabilities
$$
P(X=0)=0.2,\quad P(X=1)=0.5,\quad P(X=2)=0.3.
$$
Then
$$
E[X] = 0\cdot 0.2 + 1\cdot 0.5 + 2\cdot 0.3 = 0 + 0.5 + 0.6 = 1.1.
$$
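The same calculation as a Python sketch, using the illustrative pmf from this section:

```python
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# E[X]: multiply each value by its probability and add everything up.
expected_value = sum(x * p for x, p in pmf.items())

print(expected_value)  # 1.1
```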

Variance

The variance of $X$, written $\operatorname{Var}(X)$, measures how spread out the values of $X$ are around the mean:
$$
\operatorname{Var}(X) = E\big[(X - E[X])^2\big].
$$

For a discrete variable, this becomes a sum:
$$
\operatorname{Var}(X) = \sum_{i} (x_i - E[X])^2 \, p(x_i).
$$

A very useful equivalent formula in the discrete case is
$$
\operatorname{Var}(X) = E[X^2] - (E[X])^2,
$$
where
$$
E[X^2] = \sum_{i} x_i^2 \, p(x_i).
$$
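A short Python sketch evaluating both variance formulas on the same illustrative pmf; up to floating-point rounding the two results agree (here both are approximately $0.49$).

```python
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                 # E[X] = 1.1

# Definition: Var(X) = sum over values of (x - E[X])^2 * p(x)
var_from_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: Var(X) = E[X^2] - (E[X])^2
second_moment = sum(x ** 2 * p for x, p in pmf.items())   # E[X^2]
var_from_shortcut = second_moment - mean ** 2

print(var_from_definition, var_from_shortcut)  # both approximately 0.49
```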

(How to compute and interpret these in more detail is developed in the chapters on descriptive statistics and probability distributions; here the focus is on noting the sum structure specific to discrete variables.)

Cumulative distribution for a discrete variable

The cumulative distribution function (cdf) $F$ of a random variable $X$ is defined in general as
$$
F(x) = P(X \le x).
$$

For a discrete random variable, this becomes a step function:
$$
F(x) = \sum_{x_i \le x} p(x_i),
$$
where the sum runs over all possible values $x_i$ that are less than or equal to $x$.

So $F$ grows in discrete jumps, reflecting the underlying discrete nature of $X$.
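A small Python sketch of the cdf for the illustrative pmf makes the step behaviour visible: $F$ stays constant between the possible values and jumps at $0$, $1$ and $2$.

```python
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def cdf(x):
    # F(x) = sum of p(x_i) over all possible values x_i <= x
    return sum(p for value, p in pmf.items() if value <= x)

for x in [-0.5, 0, 0.5, 1, 1.5, 2, 3]:
    print(x, cdf(x))  # roughly 0, 0.2, 0.2, 0.7, 0.7, 1.0, 1.0
```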

Working with multiple discrete random variables

When dealing with more than one discrete random variable, we again use sums.

Suppose $X$ and $Y$ are discrete random variables that can take values $x_i$ and $y_j$ respectively. Their joint pmf is $p(x_i, y_j) = P(X = x_i,\, Y = y_j)$, the marginal pmf of $X$ is obtained by summing over the values of $Y$, $P(X = x_i) = \sum_j p(x_i, y_j)$, and $X$ and $Y$ are independent exactly when $p(x_i, y_j) = P(X = x_i)\,P(Y = y_j)$ for all $i$ and $j$.

All these formulas rely on sums over discrete sets of possible values.
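As a sketch of how these sums look in code, here is a hypothetical joint pmf for two small discrete variables, with the marginal pmf of $X$ recovered by summing over the values of $Y$ (the particular numbers are made up for illustration):

```python
# Hypothetical joint pmf: keys are (x, y) pairs, values are P(X = x, Y = y).
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

# Marginal pmf of X: sum the joint probabilities over all values of Y.
marginal_x = {}
for (x, y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p

print(marginal_x)  # approximately {0: 0.3, 1: 0.7}
```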

Recognizing when a model should be discrete

In practical problems, deciding whether to use a discrete random variable is often the first step.

You typically model with a discrete random variable when the quantity of interest is a count (“how many?”), when outcomes fall into a finite number of categories, or when only whole-number values make sense.

On the other hand, if the quantity can vary smoothly over an interval and fractional values are meaningful (time, distance, temperature), you usually choose a continuous model (handled in the chapter on continuous variables).

Summary specific to discrete variables

A discrete random variable takes finitely or countably many separate values. Its distribution is described by a pmf $p(x) = P(X = x)$, probabilities of events are obtained by summing pmf values, the expected value and variance are sums weighted by the pmf, and the cdf is a step function. Wherever integrals appear for continuous variables, sums play the corresponding role in the discrete case.
