What It Means to Diagonalize a Matrix
In this chapter, we assume you already understand eigenvalues, eigenvectors, and the idea of an eigenbasis from the parent chapter.
Diagonalization is the process of rewriting a linear transformation (or matrix) in a very simple form: as a diagonal matrix. A diagonal matrix is one where all entries off the main diagonal are zero, like
$$
D = \begin{bmatrix}
d_1 & 0 & 0 \\
0 & d_2 & 0 \\
0 & 0 & d_3
\end{bmatrix}.
$$
The key idea is:
- A matrix $A$ is diagonalizable if there is an invertible matrix $P$ and a diagonal matrix $D$ such that
$$
A = P D P^{-1}.
$$
Here:
- The columns of $P$ are eigenvectors of $A$.
- The diagonal entries of $D$ are the corresponding eigenvalues of $A$.
So diagonalization reorganizes $A$ into a form where its action is very simple: along certain directions (the eigenvectors), it just stretches by factors (the eigenvalues).
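A small numeric illustration may help (a minimal NumPy sketch; the particular $P$ and $D$ are invented for this example): build $A$ from a chosen $P$ and $D$ and watch it stretch each column of $P$ by the matching diagonal entry.

```python
import numpy as np

# Invented example: choose eigenvectors (columns of P) and eigenvalues (diagonal of D).
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
D = np.diag([2.0, 3.0])

# Assemble A = P D P^{-1}.
A = P @ D @ np.linalg.inv(P)

# Along each eigenvector direction, A acts as a pure stretch by the eigenvalue.
for i in range(2):
    v = P[:, i]
    print(A @ v, "=", D[i, i], "*", v)  # A v_i == lambda_i v_i
```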
The Diagonalization Equation $A = P D P^{-1}$
Let $A$ be an $n \times n$ matrix. Suppose $A$ has $n$ linearly independent eigenvectors:
$$
v_1, v_2, \dots, v_n
$$
with corresponding eigenvalues
$$
\lambda_1, \lambda_2, \dots, \lambda_n.
$$
Form the matrix $P$ whose columns are these eigenvectors:
$$
P = \begin{bmatrix}
| & | & & | \\
v_1 & v_2 & \cdots & v_n \\
| & | & & |
\end{bmatrix},
$$
and the diagonal matrix $D$ whose diagonal entries are these eigenvalues in the same order:
$$
D = \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix}.
$$
Then we have the diagonalization relation
$$
A P = P D,
$$
because
$$
A v_i = \lambda_i v_i \quad\text{for each } i.
$$
If $P$ is invertible (which is the same as saying that the eigenvectors $v_1,\dots,v_n$ are linearly independent), then we can multiply on the right by $P^{-1}$:
$$
A = P D P^{-1}.
$$
This equation is what we mean by "$A$ is diagonalizable" (over the field we are working in, typically $\mathbb{R}$ or $\mathbb{C}$).
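In code, this relation is easy to verify: np.linalg.eig returns the eigenvalues together with a matrix whose columns are eigenvectors (a sketch; the matrix $A$ here is an arbitrary example with distinct eigenvalues, so $P$ is guaranteed invertible).

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# eig returns the eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# Verify A P = P D, and hence A = P D P^{-1} since P is invertible.
print(np.allclose(A @ P, P @ D))                 # True
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```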
When Is a Matrix Diagonalizable?
The basic criterion is:
- An $n \times n$ matrix $A$ is diagonalizable (over a given field) if and only if it has $n$ linearly independent eigenvectors (over that field).
Put differently:
- There must be a basis of the vector space consisting entirely of eigenvectors of $A$.
- Such a basis is called an eigenbasis.
Some useful points:
- If an $n \times n$ matrix has $n$ distinct eigenvalues, then it is diagonalizable. (Each eigenvalue has at least one eigenvector, and eigenvectors for distinct eigenvalues are linearly independent; the sketch after this list checks this numerically.)
- Having repeated eigenvalues does not automatically prevent diagonalization, but it makes it more delicate.
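A quick numeric illustration of the distinct-eigenvalue criterion (a sketch; the $3 \times 3$ matrix is an arbitrary example with eigenvalues $2, 3, 5$):

```python
import numpy as np

# An arbitrary upper-triangular example with three distinct eigenvalues (2, 3, 5).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

eigenvalues, P = np.linalg.eig(A)
print(eigenvalues)  # 2, 3, 5 -- all distinct

# Distinct eigenvalues give linearly independent eigenvectors, so P has full rank:
print(np.linalg.matrix_rank(P) == 3)  # True, hence A is diagonalizable
```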
Two notions from eigenvalue theory matter here:
- The algebraic multiplicity of an eigenvalue $\lambda$ is how many times it appears as a root of the characteristic polynomial.
- The geometric multiplicity of $\lambda$ is the dimension of its eigenspace (the number of linearly independent eigenvectors with eigenvalue $\lambda$).
For diagonalization:
- For each eigenvalue $\lambda$, its geometric multiplicity must equal its algebraic multiplicity.
- Also, the sum of the geometric multiplicities over all eigenvalues must be $n$ (so we get $n$ independent eigenvectors in total).
If these conditions fail, the matrix is not diagonalizable (over that field).
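These multiplicities can be checked numerically; a minimal sketch (the helper geometric_multiplicity is invented here, and the rank test assumes exact small-integer eigenvalues so round-off is not a concern):

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace of lam: n - rank(A - lam*I)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

# The matrix B that reappears in the examples section below:
B = np.array([[2.0, 1.0],
              [0.0, 2.0]])

print(np.linalg.eigvals(B))            # [2. 2.] -> algebraic multiplicity 2
print(geometric_multiplicity(B, 2.0))  # 1 < 2, so B is not diagonalizable
```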
How to Diagonalize a Matrix in Practice
To diagonalize a matrix $A$, one typically follows these steps:
- Find the eigenvalues of $A$.
  Solve $\det(A - \lambda I) = 0$ to get the eigenvalues $\lambda_1,\dots,\lambda_k$, where some may repeat.
- For each eigenvalue, find eigenvectors.
  For each $\lambda_i$, solve
$$
(A - \lambda_i I)v = 0
$$
  to find a basis for the eigenspace.
- Check if you have enough eigenvectors.
  Collect all the eigenvectors from all eigenspaces.
  - If you can select $n$ linearly independent eigenvectors, then $A$ is diagonalizable.
  - If not, $A$ is not diagonalizable.
- Form $P$ and $D$.
  - Choose $n$ linearly independent eigenvectors and arrange them as columns of $P$.
  - For each eigenvector placed in column $i$ of $P$, put its eigenvalue in the $i$-th diagonal position of $D$.

Then $A = P D P^{-1}$.
Note: Different choices of eigenvectors (within the eigenspaces) lead to different $P$ matrices but all give a valid diagonalization, with the same diagonal entries in $D$ up to reordering.
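The whole procedure can be sketched in a few lines of NumPy (a numeric sketch, not a robust implementation; the helper diagonalize and its tolerance are invented for this example, and a symbolic tool such as SymPy's Matrix.diagonalize is preferable when eigenvalues repeat or are known exactly):

```python
import numpy as np

def diagonalize(A, tol=1e-10):
    """Return (P, D) with A = P D P^{-1}, or raise if A is not diagonalizable.

    np.linalg.eig performs steps 1-2 (eigenvalues and eigenvectors) at once;
    step 3 is the check that the eigenvector matrix P is invertible.
    """
    eigenvalues, P = np.linalg.eig(A)
    if np.linalg.matrix_rank(P, tol=tol) < A.shape[0]:
        raise ValueError("not diagonalizable: fewer than n independent eigenvectors")
    return P, np.diag(eigenvalues)

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
P, D = diagonalize(A)
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```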
Why Diagonalization Is Useful
Diagonalization turns a possibly complicated linear transformation into one that is easy to understand and compute with, in the right basis.
Here are two key computational advantages:
- Computing powers of $A$
Suppose $A = P D P^{-1}$ and we want $A^k$ for a positive integer $k$:
$$
A^k = (P D P^{-1})(P D P^{-1}) \cdots (P D P^{-1})
= P D^k P^{-1},
$$
because the $P^{-1}P$ terms cancel in the middle.
But $D^k$ is easy to compute, since $D$ is diagonal:
$$
D^k = \begin{bmatrix}
\lambda_1^k & 0 & \cdots & 0 \\
0 & \lambda_2^k & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n^k
\end{bmatrix}.
$$
So $A^k$ reduces to:
- Raise each eigenvalue to the $k$-th power.
- Conjugate back by $P$ and $P^{-1}$ (see the code sketch after this list).
- Understanding the action of $A$
  In the eigenbasis, $A$ simply scales each basis vector by its eigenvalue. This makes:
  - Long-term behavior of repeated applications of $A$ easier to understand.
  - Certain systems of difference or differential equations easier to solve (because they become decoupled in the eigenbasis).
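For concreteness, here is the powers computation in NumPy (a minimal sketch; the matrix $A$ is an arbitrary example, and np.linalg.matrix_power is used only to cross-check the eigendecomposition route):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
k = 10

eigenvalues, P = np.linalg.eig(A)

# D^k just raises each diagonal entry (eigenvalue) to the k-th power...
Dk = np.diag(eigenvalues ** k)
# ...and conjugating back by P and P^{-1} recovers A^k.
Ak = P @ Dk @ np.linalg.inv(P)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))  # True
```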
Other uses (treated in other chapters) include simplifying quadratic forms and analyzing linear dynamical systems; diagonalization plays a central role in these topics.
Examples of Diagonalizable and Non-Diagonalizable Matrices
Consider the two $2 \times 2$ matrices:
$$
A = \begin{bmatrix}
3 & 0 \\
0 & 5
\end{bmatrix},
\quad
B = \begin{bmatrix}
2 & 1 \\
0 & 2
\end{bmatrix}.
$$
- $A$ is already diagonal. So it is diagonalizable, with
- $P = I$ (the identity),
- $D = A$.
- For $B$:
- The only eigenvalue is $\lambda = 2$ (with algebraic multiplicity $2$).
- Solve $(B - 2I)v = 0$:
$$
B - 2I = \begin{bmatrix}
0 & 1 \\
0 & 0
\end{bmatrix},
$$
so any eigenvector $v = \begin{bmatrix}x \\ y\end{bmatrix}$ must satisfy $y = 0$.
The eigenspace is all multiples of $\begin{bmatrix}1 \\ 0\end{bmatrix}$, which is 1-dimensional.
Here the geometric multiplicity (1) is less than the algebraic multiplicity (2), so $B$ is not diagonalizable.
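The same conclusion can be checked programmatically; a brief sketch using SymPy's exact arithmetic (the methods eigenvects and is_diagonalizable are standard SymPy, applied here to the matrix $B$ from above):

```python
from sympy import Matrix

B = Matrix([[2, 1],
            [0, 2]])

# eigenvects() returns (eigenvalue, algebraic multiplicity, eigenspace basis).
for lam, alg_mult, basis in B.eigenvects():
    print(lam, alg_mult, len(basis))  # 2, 2, 1 -> geometric < algebraic

print(B.is_diagonalizable())          # False
```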
Real vs Complex Diagonalization
Diagonalizability depends on which field you work over.
- A real matrix might fail to be diagonalizable over $\mathbb{R}$ (no real eigenbasis exists) and yet be diagonalizable over $\mathbb{C}$ (a complex eigenbasis does exist).
- For diagonalization over $\mathbb{R}$, you need enough real eigenvectors.
- For diagonalization over $\mathbb{C}$, you allow complex eigenvalues and complex eigenvectors, which sometimes makes diagonalization possible when it was not over $\mathbb{R}$.
When discussing diagonalization, it is therefore important to specify the field (usually understood from context if not stated explicitly).
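For instance, a rotation by $90^\circ$ has no real eigenvalues at all, yet it diagonalizes over $\mathbb{C}$; a minimal NumPy sketch (the matrix $R$ is the standard $2 \times 2$ rotation example):

```python
import numpy as np

# Rotation by 90 degrees: no real eigenvalues, hence no real eigenbasis.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigenvalues, P = np.linalg.eig(R)
print(eigenvalues)                               # [i, -i]: complex eigenvalues
D = np.diag(eigenvalues)
print(np.allclose(R, P @ D @ np.linalg.inv(P)))  # True: diagonalizable over C
```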
Summary
- A square matrix $A$ is diagonalizable if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $A = P D P^{-1}$.
- This happens exactly when there is a basis of the space consisting entirely of eigenvectors of $A$ (an eigenbasis).
- The diagonal entries of $D$ are eigenvalues of $A$, and the columns of $P$ are the corresponding eigenvectors.
- Diagonalization greatly simplifies computations, especially of powers $A^k$, and clarifies the structure and behavior of linear transformations.