
Random Variables

In probability, we often care about uncertain numerical outcomes. Instead of just saying “something random happens,” we want a precise way to talk about the possible values it can take and how likely each value (or range of values) is. This is where random variables come in.

A random variable is a rule that assigns a number to each outcome of a random process. It is not “random” in the sense of changing its mind; the rule itself is fixed. The randomness comes from which outcome the experiment produces.

Formally, think of an experiment with a sample space $S$ (all possible outcomes). A random variable $X$ assigns to each outcome $s \in S$ a real number $X(s)$. We usually write $X$ simply as the “random outcome” itself, but always remember it is a function on outcomes.

Examples:

- Toss a coin three times and let $X$ be the number of heads.
- Roll two dice and let $X$ be the sum of the two faces.
- Measure how long a customer waits in line and let $X$ be the waiting time in minutes.

The sample space and probabilities are defined in the Probability Basics part of the course; here we focus on how we represent and use the associated numerical random variables.
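To make the "function on outcomes" idea concrete, here is a small illustrative Python sketch (the names `S` and `X` are ours, not part of the course): the rule $X$ is fixed, and only the outcome it is applied to varies.

```python
from itertools import product

# Sample space for two fair coin tosses: each outcome is a tuple like ("H", "T").
S = list(product("HT", repeat=2))

# The random variable X assigns a number to each outcome: the count of heads.
def X(outcome):
    return outcome.count("H")

# The rule itself never changes; the randomness is in which outcome occurs.
for s in S:
    print(s, "->", X(s))
```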

Types of Random Variables

There are two main types, each with its own way of describing probabilities.

Discrete Random Variables

A discrete random variable takes values in a set that is countable (you can list the values, even if the list is infinite). Commonly, these are integers or a finite set of numbers.

Examples:

- The number of heads in $10$ coin tosses (values $0, 1, \dots, 10$).
- The result of a single die roll (values $1$ through $6$).
- The number of emails you receive in a day (values $0, 1, 2, \dots$, a countably infinite list).

We describe a discrete random variable using a probability mass function (pmf). The pmf of a discrete random variable $X$ is a function $p$ defined by
$$
p(x) = P(X = x),
$$
for each value $x$ that $X$ can take. It must satisfy:

- $p(x) \ge 0$ for every value $x$, and
- $\sum_x p(x) = 1$ (the probabilities of all possible values add up to $1$).

Example (a fair die): let $X$ be the result of rolling a fair six-sided die. Each face is equally likely, so $p(x) = \frac{1}{6}$ for $x = 1, 2, \dots, 6$, and $p(x) = 0$ for every other value.

We can list the pmf in a small table:

| $x$ | $1$ | $2$ | $3$ | $4$ | $5$ | $6$ |
|---|---|---|---|---|---|---|
| $p(x)$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ | $1/6$ |
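A pmf like this is easy to represent and check in code. The following sketch (our own illustration) stores the fair-die pmf as a dictionary, verifies the two pmf conditions, and computes an event probability by summing pmf values:

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid pmf is nonnegative and its values sum to 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

# The probability of an event, e.g. P(X is even), is a sum of pmf values.
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)
print(p_even)  # 1/2
```

Using exact fractions rather than floats makes the "sums to 1" check exact instead of approximate.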

Continuous Random Variables

A continuous random variable can take values in an interval (or union of intervals) of real numbers, with infinitely many possible values that cannot be listed one by one.

Examples:

- The height of a randomly chosen person.
- The waiting time until the next bus arrives.
- The exact temperature at noon tomorrow.

For continuous random variables, the probability that $X$ is exactly any single specific value (like $P(X = 2.5)$) is usually $0$; instead, we talk about probabilities for intervals, such as $P(2 < X \le 3)$.

We describe a continuous random variable using a probability density function (pdf), often written $f(x)$, which must satisfy:

- $f(x) \ge 0$ for all $x$, and
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (the total area under the curve is $1$).

Probabilities are areas under the curve:
$$
P(a \le X \le b) = \int_{a}^{b} f(x)\,dx.
$$
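The "probability as area" idea can be checked numerically. This sketch uses a hypothetical pdf $f(x) = 2x$ on $[0, 1]$ (our choice for illustration; it is nonnegative and integrates to $1$) and approximates interval probabilities with a midpoint Riemann sum:

```python
# Hypothetical pdf: f(x) = 2x on [0, 1], zero elsewhere.
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

def prob(a, b, n=10_000):
    """Approximate P(a <= X <= b) as the area under f, via a midpoint Riemann sum."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

print(prob(0.0, 1.0))   # total probability, ~1.0
print(prob(0.25, 0.5))  # ~0.1875, the exact area 0.5^2 - 0.25^2
```

Note that `prob(c, c)` is $0$ for any single point `c`, matching the fact that $P(X = c) = 0$ for a continuous random variable.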

The details of specific distributions, like the normal distribution, are covered in the Probability Distributions chapter; here the important idea is that discrete random variables use a pmf (probability assigned to each value), and continuous ones use a pdf (probability as area under a curve over intervals).

Distribution of a Random Variable

The distribution of a random variable describes how its possible values are spread out, together with the probabilities of those values (or ranges of values).

The cumulative distribution function (cdf) of a random variable $X$ is defined for any real number $x$ by
$$
F(x) = P(X \le x).
$$

For discrete $X$, this is a sum:
$$
F(x) = \sum_{t \le x} P(X = t).
$$

For continuous $X$, this is an integral:
$$
F(x) = \int_{-\infty}^{x} f(t)\,dt.
$$
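For a discrete random variable, the cdf is a running sum of pmf values. Continuing the fair-die example (an illustration of ours, not the course's code):

```python
from fractions import Fraction

# pmf of a fair die, and its cdf F(x) = P(X <= x) as a sum of pmf values.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    return sum(p for t, p in pmf.items() if t <= x)

print(F(0))    # 0: no probability mass at or below 0
print(F(3))    # 1/2
print(F(3.5))  # also 1/2: F is flat between the jump points 3 and 4
print(F(6))    # 1: all the mass is at or below 6
```

The cdf of a discrete variable is a step function: it jumps by $p(x)$ at each possible value $x$ and stays flat in between.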

The cdf $F(x)$ always:

- is non-decreasing (it can only stay flat or rise as $x$ increases),
- approaches $0$ as $x \to -\infty$ and approaches $1$ as $x \to +\infty$,
- gives interval probabilities via $P(a < X \le b) = F(b) - F(a)$.

The Probability Distributions chapter focuses on important specific distributions; here we emphasize the general idea that every random variable has a distribution, which can be described by a pmf, pdf, or cdf.

Expected Value and Variance (Conceptual Overview)

Random variables can be summarized by numerical characteristics. Two of the most important are expected value (or mean) and variance. Full computational details and practice will appear in Descriptive Statistics, but we outline the connection here.

Expected Value (Mean)

The expected value of a random variable $X$ is a kind of long-run average you would expect if you could repeat the experiment many times. For a discrete $X$, it is the probability-weighted average of the possible values: $E[X] = \sum_x x\,p(x)$.

Example (fair die): with $p(x) = \frac{1}{6}$ for $x = 1, \dots, 6$,
$$
E[X] = 1 \cdot \tfrac{1}{6} + 2 \cdot \tfrac{1}{6} + \cdots + 6 \cdot \tfrac{1}{6} = \frac{21}{6} = 3.5.
$$

You can never actually roll a $3.5$, but it is the average outcome in the long run.
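Both readings of the expected value, the weighted average and the long-run average, can be seen in a short sketch (the simulation size and seed are arbitrary choices of ours):

```python
import random
from fractions import Fraction

# Expected value of a fair die: the probability-weighted average of its values.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 7/2, i.e. 3.5

# Long-run interpretation: the average of many simulated rolls approaches 3.5.
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))  # close to 3.5
```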

Variance and Standard Deviation

The variance of $X$ measures how spread out the values of $X$ are around its mean. It is defined as
$$
\mathrm{Var}(X) = E[(X - E[X])^2].
$$

The standard deviation is the square root of the variance:
$$
\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)}.
$$

A small variance (and standard deviation) means the random variable usually takes values close to its mean; a large variance means the values are more spread out.
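The defining formula $\mathrm{Var}(X) = E[(X - E[X])^2]$ can be applied directly to the fair-die pmf (our illustrative sketch; the full treatment is in Descriptive Statistics):

```python
import math
from fractions import Fraction

# Fair die: pmf, mean, then variance as the expected squared deviation from the mean.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())               # 7/2

var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # E[(X - E[X])^2]
sd = math.sqrt(var)

print(var)  # 35/12
print(sd)   # ~1.708
```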

The explicit formulas and examples for variance and standard deviation are covered in the Descriptive Statistics part of the course. Here it is enough to connect them conceptually to random variables as measures of typical value and spread.

Functions of a Random Variable

Often we apply a function to a random variable to create a new random variable. If $X$ is a random variable and $g$ is a deterministic function, then
$$
Y = g(X)
$$
is also a random variable.

Examples:

- If $X$ is a die roll, then $Y = X^2$ is the squared result.
- If $X$ is a temperature in Celsius, then $Y = \frac{9}{5}X + 32$ is the same temperature in Fahrenheit.

The distribution of $Y$ depends on the distribution of $X$ and the function $g$. In later chapters (especially Probability Distributions and beyond), we study how to find the distribution of transformed random variables, especially for sums and linear combinations (like $aX + b$).
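For a discrete $X$, the pmf of $Y = g(X)$ can be found by collecting the probability of every $x$ that maps to the same $y$. A sketch, using the fair die and the illustrative choice $g(x) = (x - 3)^2$ (note that several $x$ values map to the same $y$):

```python
from collections import defaultdict
from fractions import Fraction

# X: a fair die roll. Y = g(X) with g(x) = (x - 3)^2, an illustrative function.
pmf_X = {x: Fraction(1, 6) for x in range(1, 7)}

def g(x):
    return (x - 3) ** 2

# pmf of Y: sum the probabilities of all x values that map to the same y.
pmf_Y = defaultdict(Fraction)
for x, p in pmf_X.items():
    pmf_Y[g(x)] += p

print(dict(pmf_Y))  # e.g. y = 1 gets mass from both x = 2 and x = 4
```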

Multiple Random Variables and Independence (Preview)

Sometimes we consider more than one random variable at the same time, defined on the same underlying experiment.

Example: toss a coin three times, and on the same tosses define $X$ as the number of heads and $Y$ as $1$ if the first toss is heads and $0$ otherwise.

We can talk about:

- the distribution of each variable on its own,
- their joint behavior (probabilities of pairs of values, like $P(X = 2, Y = 1)$),
- whether they are independent (knowing one tells you nothing about the other) or dependent.

The formal treatment of joint distributions, dependence, and independence is done in more advanced probability study. Here the key idea is that each such quantity you define (number of heads, waiting time, height, etc.) is a random variable, and they can be studied alone or together.
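As a preview, here is a sketch (our own illustration) of two random variables defined on the same sample space of three coin tosses, with their joint probabilities computed by walking over the shared outcomes:

```python
from itertools import product
from fractions import Fraction

# Three fair coin tosses; each of the 8 outcomes is equally likely.
S = list(product("HT", repeat=3))
p = Fraction(1, 8)

# Two random variables defined on the SAME outcomes:
def X(s): return s.count("H")              # number of heads
def Y(s): return 1 if s[0] == "H" else 0   # indicator: first toss is heads

# Joint probabilities P(X = x, Y = y), accumulated over the shared sample space.
joint = {}
for s in S:
    key = (X(s), Y(s))
    joint[key] = joint.get(key, Fraction(0)) + p

print(joint)
```

Because both variables are functions of the same outcome, their joint distribution is fully determined by the underlying experiment.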

Why Random Variables Matter

Random variables are the basic tools for connecting probability to numbers and data:

- They turn the outcomes of experiments into numbers we can compute with.
- Their distributions (pmf, pdf, cdf) tell us how likely each value or range of values is.
- Summaries like the mean and variance compress a whole distribution into a few interpretable numbers.

Whenever you see words like “the number of,” “the time until,” “the amount of,” “the height of,” describing an uncertain quantity, you are looking at a random variable. Understanding them provides the bridge between abstract probability and real-world data.
