Diagonalization
Subtopic: Eigentheory
Topic: Linear Algebra
Introduction
Some matrices have a hidden simplicity. When viewed in the right coordinate system, they act as pure scalings along independent directions — no mixing, no rotation, just stretching or compressing along each axis. Such matrices are called diagonalizable, and finding this special coordinate system is the process of diagonalization.
Diagonalization transforms a matrix A into a diagonal matrix D through a change of basis. The diagonal entries of D are the eigenvalues of A, and the change-of-basis matrix P has eigenvectors as its columns. This decomposition A = PDP^{-1} is one of the most powerful tools in linear algebra.
Why does diagonalization matter? Diagonal matrices are trivial to work with: their powers, exponentials, and functions are computed entry by entry. By diagonalizing A, we can easily compute A^{100}, solve differential equations x' = Ax, and understand the long-term behavior of dynamical systems. Diagonalization transforms hard matrix problems into manageable ones.
The Diagonalization Theorem
An n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors.
When A is diagonalizable, we can write:
A = PDP^{-1}
where D is a diagonal matrix with the eigenvalues λ_1, λ_2, …, λ_n on the diagonal:
D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}
and P is the matrix whose columns are the corresponding eigenvectors v_1, v_2, …, v_n:
P = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix}
The order of eigenvalues in D must match the order of the corresponding eigenvectors in P.
Why This Works
The equation A = PDP^{-1} is equivalent to AP = PD. Examine this column by column.
The i-th column of AP is A v_i. The i-th column of PD is λ_i v_i (since multiplying P on the right by the diagonal matrix D scales each column by the corresponding diagonal entry).
Thus AP = PD states exactly that A v_i = λ_i v_i for each i, which is precisely the definition of eigenvalues and eigenvectors. The matrix P must be invertible, which requires its columns to be linearly independent.
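This column-by-column reading can be checked numerically. The sketch below uses NumPy's `np.linalg.eig`, which returns the eigenvalues and a matrix whose columns are eigenvectors; the 2×2 matrix is a hypothetical example chosen for illustration.

```python
import numpy as np

# Hypothetical example matrix (symmetric, so certainly diagonalizable).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, P = np.linalg.eig(A)  # columns of P are eigenvectors
D = np.diag(eigenvalues)

# Each column of AP equals the matching column of PD, i.e. A v_i = lambda_i v_i.
for i in range(A.shape[0]):
    assert np.allclose(A @ P[:, i], eigenvalues[i] * P[:, i])

# Since the columns of P are independent, P is invertible and A = P D P^{-1}.
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```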
Conditions for Diagonalizability
Sufficient Condition: Distinct Eigenvalues
If A has n distinct eigenvalues, then A is diagonalizable. This is because eigenvectors corresponding to distinct eigenvalues are always linearly independent.
General Condition
When eigenvalues repeat, diagonalizability depends on the eigenspaces. For each eigenvalue λ:
• The algebraic multiplicity is the number of times λ appears as a root of the characteristic polynomial.
• The geometric multiplicity is the dimension of the eigenspace null(A − λI).
A matrix is diagonalizable if and only if the geometric multiplicity equals the algebraic multiplicity for every eigenvalue. Equivalently, the sum of the geometric multiplicities must equal n.
Defective Matrices
A matrix that is not diagonalizable is called defective. For a defective matrix, some eigenvalue has geometric multiplicity strictly less than its algebraic multiplicity: there simply aren't enough linearly independent eigenvectors.
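The classic defective example is a Jordan block. As a sketch, the NumPy snippet below checks the two multiplicities for the (assumed) example matrix [[2, 1], [0, 2]], using the rank–nullity theorem to get the geometric multiplicity.

```python
import numpy as np

# Jordan block: eigenvalue 2 has algebraic multiplicity 2.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

eigenvalues, _ = np.linalg.eig(A)
print(eigenvalues)  # 2 appears twice: algebraic multiplicity 2

# Geometric multiplicity = dim null(A - 2I) = n - rank(A - 2I).
rank = np.linalg.matrix_rank(A - 2 * np.eye(2))
geometric_multiplicity = 2 - rank
print(geometric_multiplicity)  # 1 < 2, so A is defective
```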
Worked Example
Diagonalize the matrix:
A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}
Step 1: Find Eigenvalues
The characteristic polynomial is:
det(A − λI) = (4 − λ)(3 − λ) − 2 = λ^2 − 7λ + 10 = (λ − 5)(λ − 2)
So λ_1 = 5 and λ_2 = 2.
Step 2: Find Eigenvectors
For λ_1 = 5: solving (A − 5I)v = 0 gives v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
For λ_2 = 2: solving (A − 2I)v = 0 gives v_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}.
Step 3: Form P and D
P = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}, \quad D = \begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}
Step 4: Verify
We can verify: since det(P) = −3,
P^{-1} = \frac{1}{-3}\begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 2/3 & 1/3 \\ 1/3 & -1/3 \end{pmatrix}
Then PDP^{-1} = A, as required.
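The verification in Step 4 is quick to reproduce with NumPy, as a sanity-check sketch using the P and D found above:

```python
import numpy as np

# Matrices from the worked example.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0, 1.0],
              [1.0, -2.0]])
D = np.diag([5.0, 2.0])

# The defining relation AP = PD, column by column.
assert np.allclose(A @ P, P @ D)

# And the full decomposition A = P D P^{-1}.
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```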
Powers of Diagonalizable Matrices
One of the main applications of diagonalization is computing matrix powers. If A = PDP^{-1}, then:
A^k = P D^k P^{-1}
Since D is diagonal, D^k is simply:
D^k = \begin{pmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{pmatrix}
This makes computing A^{100} or A^{1000} straightforward: we just raise each eigenvalue to the appropriate power.
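Reusing the worked example above, a short sketch compares A^k computed through the decomposition against direct repeated multiplication (`np.linalg.matrix_power`):

```python
import numpy as np

# P and D from the worked example: A = P D P^{-1}.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0, 1.0],
              [1.0, -2.0]])

k = 10
# D^k raises each eigenvalue to the k-th power, entry by entry.
Dk = np.diag([5.0**k, 2.0**k])
Ak = P @ Dk @ np.linalg.inv(P)

# Same result as multiplying A by itself k times.
assert np.allclose(Ak, np.linalg.matrix_power(A, k))
```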
Matrix Exponential
For applications in differential equations, we often need the matrix exponential e^{At}. If A = PDP^{-1}, then:
e^{At} = P e^{Dt} P^{-1}
where
e^{Dt} = \begin{pmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{pmatrix}
This is essential for solving linear systems of differential equations x'(t) = A x(t).
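A quick numerical check of this formula, sketched with the worked-example matrix and compared against SciPy's general-purpose `scipy.linalg.expm` (assumed available):

```python
import numpy as np
from scipy.linalg import expm  # assumes SciPy is installed

# A = P D P^{-1} from the worked example, eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0, 1.0],
              [1.0, -2.0]])
t = 0.5

# e^{Dt}: exponentiate each eigenvalue times t on the diagonal.
eDt = np.diag([np.exp(5.0 * t), np.exp(2.0 * t)])
eAt = P @ eDt @ np.linalg.inv(P)

# Matches the matrix exponential computed directly.
assert np.allclose(eAt, expm(A * t))
```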
Connection to Other Concepts
Diagonalization connects deeply to several important topics.
The spectral theorem guarantees that every real symmetric matrix is diagonalizable with an orthonormal basis of eigenvectors. The diagonalizing matrix P is orthogonal (P^{-1} = P^T), giving the decomposition A = PDP^T.
For matrices that are not diagonalizable, the Jordan normal form provides a generalization. Instead of a diagonal matrix, we get a block-diagonal matrix with Jordan blocks.
The singular value decomposition (SVD) extends diagonalization ideas to rectangular matrices, expressing any matrix as A = UΣV^T.
Summary
An n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors. When A is diagonalizable, A = PDP^{-1}, where D is diagonal with the eigenvalues on the diagonal and P has the eigenvectors as columns.
Diagonalization simplifies matrix computations dramatically: A^k = P D^k P^{-1} and e^{At} = P e^{Dt} P^{-1}. This is the key to solving linear recurrences and differential equations.
A matrix with n distinct eigenvalues is always diagonalizable. With repeated eigenvalues, we must check that geometric multiplicity equals algebraic multiplicity for each eigenvalue.