Subsection 9.2.6 Diagonalizing a matrix

The algebraic eigenvalue problem or, more simply, the computation of eigenvalues and eigenvectors is often presented as the problem of diagonalizing a matrix. We make that link in this unit.

Definition 9.2.6.1. Diagonalizable matrix.

A matrix \(A \in \Cmxm \) is said to be diagonalizable if and only if there exist a nonsingular matrix \(X \) and diagonal matrix \(D \) such that \(X^{-1} A X = D \text{.}\)

Why is this important? Consider the equality \(w = A v.\) Notice that we can write \(w \) as a linear combination of the columns of \(X \text{:}\)

\begin{equation*} w = X \begin{array}[t]{c} \underbrace{ (X^{-1} w) } \\ \widetilde w \end{array}. \end{equation*}

In other words, \(X^{-1} w \) is the vector of coefficients when \(w \) is written in terms of the basis that consists of the columns of \(X \text{.}\) Similarly, we can write \(v \) as a linear combination of the columns of \(X \text{:}\)

\begin{equation*} v = X \begin{array}[t]{c} \underbrace{ (X^{-1} v)} \\ \widetilde v \end{array}. \end{equation*}

Now, since \(X \) is nonsingular, \(w = A v\) is equivalent to \(X^{-1} w = X^{-1} A X X^{-1} v ,\) and hence \(X^{-1} w = D ( X^{-1} v ).\)

Remark 9.2.6.2.

We conclude that if we view the matrices in the right basis (namely the basis that consists of the columns of \(X \)), then the transformation \(w := A v \) simplifies to \(\widetilde w := D \widetilde v \text{.}\) This is a really big deal.
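This change of basis can be sketched numerically. The following is a minimal illustration with an assumed \(2 \times 2 \) example matrix, using NumPy's eigendecomposition to supply \(X \) and \(D \text{:}\)

```python
import numpy as np

# Sketch: if X^{-1} A X = D, then w = A v in the standard basis
# becomes the diagonal scaling w~ = D v~ in the basis given by
# the columns of X.  (Example matrix assumed diagonalizable.)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lambdas, X = np.linalg.eig(A)      # columns of X are eigenvectors
D = np.diag(lambdas)

v = np.array([1.0, 1.0])
w = A @ v                          # transformation in the standard basis

v_tilde = np.linalg.solve(X, v)    # coefficients of v in the basis X
w_tilde = np.linalg.solve(X, w)    # coefficients of w in the basis X

print(np.allclose(w_tilde, D @ v_tilde))   # True: w~ = D v~
```

In the eigenvector basis the matrix-vector product reduces to elementwise scaling, which is why diagonalization "is a really big deal."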

How is diagonalizing a matrix related to eigenvalues and eigenvectors? Let's assume that \(X^{-1} A X = D \text{.}\) We can rewrite this as

\begin{equation*} A X = X D \end{equation*}

and partition

\begin{equation*} A \left( \begin{array}{c | c | c | c } x_0 \amp x_1 \amp \cdots \amp x_{m-1} \end{array} \right) = \left( \begin{array}{c | c | c | c } x_0 \amp x_1 \amp \cdots \amp x_{m-1} \end{array} \right) \left( \begin{array}{c | c | c | c } \delta_0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \delta_1 \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \delta_{m-1} \end{array} \right) . \end{equation*}

Multiplying this out yields

\begin{equation*} \left( \begin{array}{c | c | c | c } A x_0 \amp A x_1 \amp \cdots \amp A x_{m-1} \end{array} \right) = \left( \begin{array}{c | c | c | c } \delta_0 x_0 \amp \delta_1 x_1 \amp \cdots \amp \delta_{m-1} x_{m-1} \end{array} \right). \end{equation*}

We conclude that

\begin{equation*} A x_j = \delta_j x_j, \quad j = 0, \ldots, m-1, \end{equation*}

which means that the entries on the diagonal of \(D \) are the eigenvalues of \(A \) and the corresponding eigenvectors are found as columns of \(X \text{.}\)

Now assume that the eigenvalues of \(A \in \Cmxm \) are given by \(\{ \lambda_0, \lambda_1, \ldots , \lambda_{m-1} \} \text{,}\) where eigenvalues are repeated according to their algebraic multiplicity. Assume that there are \(m \) linearly independent vectors \(\{ x_0, x_1, \ldots, x_{m-1}\}\) such that \(A x_j = \lambda_j x_j \text{.}\) Then

\begin{equation*} A \left( \begin{array}{c | c | c | c } x_0 \amp x_1 \amp \cdots \amp x_{m-1} \end{array} \right) = \left( \begin{array}{c | c | c | c } x_0 \amp x_1 \amp \cdots \amp x_{m-1} \end{array} \right) \left( \begin{array}{c | c | c | c } \lambda_0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \lambda_1 \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \lambda_{m-1} \end{array} \right). \end{equation*}

Hence, if \(X = \left( \begin{array}{c | c | c | c } x_0 \amp x_1 \amp \cdots \amp x_{m-1} \end{array} \right) \) and \(D = \diag{ \lambda_0, \lambda_1, \ldots, \lambda_{m-1} } \) then \(X^{-1} A X = D \text{.}\) In other words, if \(A \) has \(m \) linearly independent eigenvectors, then \(A\) is diagonalizable.
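Both directions of this equivalence can be checked numerically. Below is a sketch with an assumed \(3 \times 3 \) example matrix: the eigenvectors returned by NumPy form \(X \text{,}\) the eigenvalues form \(D \text{,}\) and we verify \(X^{-1} A X = D \text{:}\)

```python
import numpy as np

# Sketch: assemble X from eigenvectors and D from eigenvalues,
# then verify X^{-1} A X = D.  This assumes A has m linearly
# independent eigenvectors, so that X is nonsingular.
A = np.array([[4.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])    # distinct eigenvalues 4, 2, 1
lambdas, X = np.linalg.eig(A)      # A x_j = lambda_j x_j
D = np.diag(lambdas)

# A X = X D, and since X is nonsingular, X^{-1} A X = D:
print(np.allclose(np.linalg.inv(X) @ A @ X, D))   # True
```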

These insights are summarized as follows: a matrix \(A \in \Cmxm \) is diagonalizable if and only if it has \(m \) linearly independent eigenvectors.

Here are some classes of matrices that are diagonalizable:

  • Diagonal matrices.

If \(A \) is diagonal, then choosing \(X = I \) and \(D = A \) yields \(X^{-1} A X= D \text{.}\)

  • Hermitian matrices.

If \(A \) is Hermitian, then the spectral decomposition theorem tells us that there exist a unitary matrix \(Q \) and diagonal matrix \(D \) such that \(A = Q D Q^H \text{.}\) Choosing \(X = Q \) yields \(X^{-1} A X= D \text{.}\)

  • Triangular matrices with distinct diagonal elements.

    If \(U \) is upper triangular and has distinct diagonal elements, then by Homework 9.2.4.4 we know we can find an eigenvector associated with each diagonal element and by design those eigenvectors are linearly independent. Obviously, this can be extended to lower triangular matrices as well.
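For the triangular case, the eigenvector construction can be sketched as follows. The example matrix is assumed; for the \(j\)th diagonal element \(\delta_j \text{,}\) the eigenvector has the form \(\left( \begin{array}{c} x_0 \\ 1 \\ 0 \end{array} \right) \text{,}\) where \(x_0 \) solves a \(j \times j \) triangular system (this mirrors, as an illustration, the construction from Homework 9.2.4.4):

```python
import numpy as np

# Sketch: for upper triangular U with distinct diagonal entries,
# the eigenvector for the j-th diagonal element delta_j is
# ( x0 ; 1 ; 0 ), where (U00 - delta_j I) x0 = -u01 in the
# partitioning U = ( U00 u01 U02 ; 0 delta_j u12^T ; 0 0 U22 ).
U = np.array([[4.0, 1.0, 2.0],
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 1.0]])    # distinct diagonal: 4, 2, 1
m = U.shape[0]
X = np.zeros((m, m))
for j in range(m):
    x = np.zeros(m)
    x[j] = 1.0
    if j > 0:
        # solve (U00 - delta_j I) x0 = -u01
        x[:j] = np.linalg.solve(U[:j, :j] - U[j, j] * np.eye(j),
                                -U[:j, j])
    X[:, j] = x

D = np.diag(np.diag(U))
print(np.allclose(np.linalg.inv(X) @ U @ X, D))   # True
```

Because the diagonal elements are distinct, each system \(U_{00} - \delta_j I \) is nonsingular, and the resulting eigenvectors are linearly independent by design.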

Homework 9.2.6.1.

Let \(A \in \Cmxm \) have distinct eigenvalues. Show that \(A \) is diagonalizable.

Hint

Use the Schur Decomposition Theorem.

Solution

Let \(A = Q U Q^H \) be the Schur decomposition of matrix \(A \text{.}\) Since \(U \) is upper triangular and has the same eigenvalues as \(A \text{,}\) it has distinct entries along its diagonal. Hence, by our earlier observations, there exists a nonsingular matrix \(X \) such that \(X^{-1} U X = D \text{,}\) where \(D \) is diagonal. Now,

\begin{equation*} X^{-1} Q^H A Q X = X^{-1} U X = D \end{equation*}

and hence \(Y = Q X \) is the nonsingular matrix that diagonalizes \(A \text{.}\)
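This argument can be traced numerically. The sketch below uses SciPy's Schur decomposition on an assumed random matrix (which almost surely has distinct eigenvalues), finds \(X \) as the eigenvector matrix of the triangular factor, and checks that \(Y = Q X \) diagonalizes \(A \text{:}\)

```python
import numpy as np
from scipy.linalg import schur

# Sketch of the solution: A = Q U Q^H (Schur), then a nonsingular
# X with X^{-1} U X = D, so Y = Q X diagonalizes A.
# The example matrix is a hypothetical random one.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

U, Q = schur(A, output='complex')   # A = Q U Q^H, U upper triangular
lambdas, X = np.linalg.eig(U)       # eigenvectors of the triangular U
D = np.diag(lambdas)

Y = Q @ X                           # Y^{-1} A Y = X^{-1} Q^H A Q X = D
print(np.allclose(np.linalg.inv(Y) @ A @ Y, D))   # True
```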