
Subsection 9.2.4 The Schur and Spectral Decompositions

Practical methods for computing eigenvalues and eigenvectors transform a given matrix into a simpler matrix (diagonal or tridiagonal) via a sequence of transformations that preserve eigenvalues known as similarity transformations.

Definition 9.2.4.1.

Given a nonsingular matrix \(Y \text{,}\) the transformation \(Y^{-1} A Y \) is called a similarity transformation (applied to matrix \(A \)).

Definition 9.2.4.2.

Matrices \(A \) and \(B \) are said to be similar if there exists a nonsingular matrix \(Y \) such that \(B = Y^{-1} A Y \text{.}\)

Homework 9.2.4.1.

Let \(A, B, Y \in \mathbb C^{m \times m} \text{,}\) where \(Y \) is nonsingular, and \(( \lambda, x ) \) an eigenpair of \(A \text{.}\)

Which of the following is an eigenpair of \(B = Y^{-1} A Y \text{:}\)

  • \(( \lambda, x ) \text{.}\)

  • \(( \lambda, Y^{-1} x ) \text{.}\)

  • \(( \lambda, Y x ) \text{.}\)

  • \(( 1/\lambda, Y^{-1} x ) \text{.}\)

Answer

\(( \lambda, Y^{-1} x ) \text{.}\)

Now justify your answer.

Solution

Since \(A x = \lambda x \text{,}\) multiplying both sides by \(Y^{-1} \) from the left and inserting \(Y Y^{-1} = I \) between \(A \) and \(x \) yields

\begin{equation*} Y^{-1} A Y Y^{-1} x = \lambda Y^{-1} x . \end{equation*}

Hence \(( \lambda, Y^{-1} x ) \) is an eigenpair of \(B \text{.}\)
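A quick numerical check of this fact, sketched with numpy (the random matrices and seed are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
A = rng.standard_normal((m, m))
Y = rng.standard_normal((m, m))   # generically nonsingular

# Pick an eigenpair (lambda, x) of A.
lams, X = np.linalg.eig(A)
lam, x = lams[0], X[:, 0]

B = np.linalg.solve(Y, A @ Y)     # B = Y^{-1} A Y
v = np.linalg.solve(Y, x)         # Y^{-1} x

# (lambda, Y^{-1} x) is an eigenpair of B.
print(np.allclose(B @ v, lam * v))  # True
```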

The key observation is that similarity transformations preserve the eigenvalues of a matrix, as summarized in the following theorem.

Theorem 9.2.4.3.

Let \(A, B, Y \in \mathbb C^{m \times m} \text{,}\) where \(Y \) is nonsingular and \(B = Y^{-1} A Y \text{.}\) Then \(\Lambda( A ) = \Lambda( B ) \text{.}\)

Proof.

Let \(\lambda \in \Lambda( A ) \) and \(x \) be an associated eigenvector. Then \(A x = \lambda x \) if and only if \(Y^{-1} A Y Y^{-1} x = Y^{-1} \lambda x \) if and only if \(B ( Y^{-1} x ) = \lambda ( Y^{-1} x ) \text{.}\)

It is not hard to extend the last proof to show that if \(A\) is similar to \(B \) and \(\lambda \in \Lambda( A )\) has algebraic multiplicity \(k \text{,}\) then \(\lambda \in \Lambda( B ) \) has algebraic multiplicity \(k \text{.}\)
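One way to see this is via the characteristic polynomial: similar matrices have the same characteristic polynomial, since

\begin{equation*} \det( \lambda I - B ) = \det( Y^{-1} ( \lambda I - A ) Y ) = \det( Y^{-1} ) \det( \lambda I - A ) \det( Y ) = \det( \lambda I - A ), \end{equation*}

where the last equality uses \(\det( Y^{-1} ) \det( Y ) = 1 \text{.}\) Hence each eigenvalue appears as a root with the same algebraic multiplicity.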

In Subsection 2.2.7, we argued that the application of unitary matrices is desirable, since they preserve length and hence don't amplify error. For this reason, unitary similarity transformations are our weapon of choice when designing algorithms for computing eigenvalues and eigenvectors.

Definition 9.2.4.4.

Given unitary matrix \(Q \) the transformation \(Q^H A Q \) is called a unitary similarity transformation (applied to matrix \(A \)).

The following is a fundamental theorem for the algebraic eigenvalue problem that is key to practical algorithms for finding eigenvalues and eigenvectors.

Theorem 9.2.4.5. Schur Decomposition Theorem.

Let \(A \in \mathbb C^{m \times m} \text{.}\) Then there exist a unitary matrix \(Q \) and an upper triangular matrix \(U \) such that \(A = Q U Q^H \text{.}\) This decomposition is called the Schur decomposition of matrix \(A \text{.}\)

Proof.

We will outline how to construct \(Q \) so that \(Q^H A Q = U \text{,}\) an upper triangular matrix.

Since the characteristic polynomial of \(A \) has degree \(m \) and hence has at least one root, matrix \(A \) has at least one eigenvalue, \(\lambda_1 \text{,}\) and corresponding eigenvector \(q_1 \text{,}\) which we normalize to have unit length. Thus \(A q_1 = \lambda_1 q_1 \text{.}\) Choose \(Q_2 \) so that \(Q = \left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right) \) is unitary. Then

\begin{equation*} \begin{array}{l} Q^H A Q \\ ~~~=~~~~ \\ \left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right)^H A \left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right) \\ ~~~=~~~~ \\ \left( \begin{array}{ c|c} q_1^H A q_1 \amp q_1^H A Q_2 \\ \hline Q_2^H A q_1 \amp Q_2^H A Q_2 \end{array} \right) \\ ~~~=~~~~ \\ \left( \begin{array}{ c|c} \lambda_1 \amp q_1^H A Q_2 \\ \hline \lambda_1 Q_2^H q_1 \amp Q_2^H A Q_2 \end{array} \right) \\ ~~~=~~~~ \\ \left( \begin{array}{ c|c} \lambda_1 \amp w^T \\ \hline 0 \amp B \end{array} \right), \end{array} \end{equation*}

where \(w^T = q_1^H A Q_2 \) and \(B = Q_2^H A Q_2 \text{.}\) Here the \((1,1)\) entry equals \(\lambda_1 \) since \(q_1^H A q_1 = \lambda_1 q_1^H q_1 = \lambda_1 \text{,}\) and the \((2,1)\) block vanishes since \(Q_2^H q_1 = 0 \text{.}\) This insight can be used to construct an inductive proof.
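One step of this construction can be sketched numerically: given an eigenpair \(( \lambda_1, q_1 ) \text{,}\) complete \(q_1 \) to a unitary \(Q \) via a QR factorization and observe the zero block. A minimal sketch with numpy (the random matrix and the QR-based completion are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5
A = rng.standard_normal((m, m))

# One eigenpair (lambda_1, q_1), with q_1 normalized to unit length.
lams, V = np.linalg.eig(A)
lam1, q1 = lams[0], V[:, 0]
q1 = q1 / np.linalg.norm(q1)

# Complete q1 to a unitary Q = [ q1 | Q2 ] via a QR factorization.
M = np.column_stack([q1, rng.standard_normal((m, m - 1))]).astype(complex)
Q, _ = np.linalg.qr(M)
# QR may change the phase of the first column; realign it with q1.
Q[:, 0] = q1

T = Q.conj().T @ A @ Q
print(np.allclose(T[1:, 0], 0))    # first column below lambda_1 is zero
print(np.isclose(T[0, 0], lam1))   # (1,1) entry equals lambda_1
```

The rest of `T` (the trailing \(B = Q_2^H A Q_2 \)) is then processed recursively, which is exactly the inductive step.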

In other words: Given matrix \(A \text{,}\) there exists a unitary matrix \(Q \) such that applying the unitary similarity transformation \(Q^H A Q \) yields an upper triangular matrix \(U \text{.}\) Since then \(\Lambda( A ) = \Lambda( U ) \text{,}\) the eigenvalues of \(A \) can be found on the diagonal of \(U \text{.}\) The eigenvectors of \(U \) can be computed and from those the eigenvectors of \(A \) can be recovered.

One should not mistake the above theorem and its proof for a constructive way to compute the Schur decomposition: finding an eigenvalue, \(\lambda_1 \text{,}\) and/or the eigenvector associated with it, \(q_1 \text{,}\) is difficult. Also, completing the unitary matrix \(\left( \begin{array}{c | c } q_1 \amp Q_2 \end{array} \right) \) is expensive (requiring the equivalent of a QR factorization).
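In practice, library routines compute the Schur decomposition by other means (the QR algorithm, discussed later). A sketch of what the decomposition looks like numerically, assuming scipy is available:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

# Complex Schur decomposition: A = Q U Q^H with U upper triangular.
U, Q = schur(A, output='complex')

print(np.allclose(Q @ U @ Q.conj().T, A))   # A = Q U Q^H
print(np.allclose(np.tril(U, -1), 0))       # U is upper triangular

# Each diagonal entry of U is an eigenvalue of A.
evals = np.linalg.eigvals(A)
print(all(np.min(np.abs(evals - u)) < 1e-8 for u in np.diag(U)))
```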

Homework 9.2.4.2.

Let \(A \in \C^{m \times m} \text{,}\) \(A = Q U Q^H \) be its Schur decomposition, and \(X^{-1} U X = \Lambda \text{,}\) where \(\Lambda \) is a diagonal matrix and \(X \) is nonsingular.

  • How are the elements of \(\Lambda \) related to the elements of \(U \text{?}\)

  • How are the columns of \(X \) related to the eigenvectors of \(A \text{?}\)

Solution
  • How are the elements of \(\Lambda \) related to the elements of \(U \text{?}\)

    The diagonal elements of \(U \) equal the diagonal elements of \(\Lambda \text{.}\)

  • How are the columns of \(X \) related to the eigenvectors of \(A \text{?}\)

    \begin{equation*} A = Q U Q^H = Q X \Lambda X^{-1} Q^H = ( Q X ) \Lambda ( Q X )^{-1}. \end{equation*}

    Hence the columns of \(Q X \) equal eigenvectors of \(A \text{.}\)
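The recovery of eigenvectors via \(Q X \) can be checked numerically; a sketch assuming scipy is available (random complex test matrix is an illustrative choice):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Schur decomposition A = Q U Q^H.
U, Q = schur(A, output='complex')

# Diagonalize the triangular factor: U X = X Lambda.
lams, X = np.linalg.eig(U)

# Columns of Q X are eigenvectors of A.
V = Q @ X
for j in range(4):
    print(np.allclose(A @ V[:, j], lams[j] * V[:, j]))
```

Note that this relies on \(U \) being diagonalizable, which holds generically (e.g., when the eigenvalues are distinct).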

If the matrix is Hermitian, then the Schur decomposition has the added property that \(U \) is diagonal. The resulting decomposition is known as the spectral decomposition.

Theorem 9.2.4.6. Spectral Decomposition Theorem.

Let \(A \in \mathbb C^{m \times m} \) be Hermitian. Then there exist a unitary matrix \(Q \) and a diagonal matrix \(D \) with real-valued diagonal elements such that \(A = Q D Q^H \text{.}\)

Proof.

Let \(A = Q U Q^H \) be the Schur decomposition of \(A \text{.}\) Then \(U = Q^H A Q \text{.}\) Since \(A \) is Hermitian, so is \(U \text{,}\) since \(U^H = ( Q^H A Q )^H = Q^H A^H Q = Q^H A Q = U \text{.}\) A triangular matrix that is Hermitian is diagonal, so \(U \) is diagonal; call it \(D \text{.}\) Since any Hermitian matrix has real-valued diagonal elements, \(D \) is real-valued.
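This can be observed numerically: for a Hermitian matrix the computed Schur factor is diagonal with (numerically) real diagonal entries. A sketch assuming scipy is available:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2            # Hermitian test matrix

U, Q = schur(A, output='complex')

# For Hermitian A, the triangular factor is diagonal with real entries.
print(np.allclose(U, np.diag(np.diag(U))))
print(np.allclose(np.diag(U).imag, 0))
```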

In practical algorithms, it will often occur that an intermediate result can be partitioned into smaller subproblems. This is known as deflating the problem and builds on the following insights.

Theorem 9.2.4.7.

Let \(A \in \mathbb C^{m \times m} \) be of form \(A = \FlaTwoByTwo{ A_{TL} }{ A_{TR} }{ 0 }{ A_{BR} } \text{,}\) where \(A_{TL} \) and \(A_{BR} \) are square. Then \(\Lambda( A ) = \Lambda( A_{TL} ) \cup \Lambda( A_{BR} ) \text{.}\)

The proof of the above theorem follows from the next homework, which shows how the Schur decomposition of \(A \) can be computed from the Schur decompositions of \(A_{TL} \) and \(A_{BR} \text{.}\)

Homework 9.2.4.3.

Let \(A \in \C^{m \times m} \) be of form

\begin{equation*} A = \FlaTwoByTwo{ A_{TL} }{ A_{TR} }{ 0 }{ A_{BR} } , \end{equation*}

where \(A_{TL} \) and \(A_{BR} \) are square submatrices with Schur decompositions

\begin{equation*} A_{TL} = Q_{TL} U_{TL} Q_{TL}^H \mbox{ and } A_{BR} = Q_{BR} U_{BR} Q_{BR}^H. \end{equation*}

Give the Schur decomposition of \(A \text{.}\)

Solution
\begin{equation*} \begin{array}{l} A \\ ~~~ = ~~~~ \\ \FlaTwoByTwo{ A_{TL} }{ A_{TR} }{ 0 }{ A_{BR} } \\ ~~~ = ~~~~ \\ \FlaTwoByTwo{ Q_{TL} U_{TL} Q_{TL}^H }{ A_{TR} } { 0 }{ Q_{BR} U_{BR} Q_{BR}^H } \\ ~~~ = ~~~~ \\ \begin{array}[t]{c} \underbrace{ \FlaTwoByTwo{ Q_{TL} }{ 0 } { 0 }{ Q_{BR} } } \\ Q \end{array} \begin{array}[t]{c} \underbrace{ \FlaTwoByTwo{ U_{TL} }{ Q_{TL}^H A_{TR} Q_{BR} } { 0 }{ U_{BR} } } \\ U \end{array} \begin{array}[t]{c} \underbrace{ \FlaTwoByTwo{ Q_{TL} }{ 0 } { 0 }{ Q_{BR} }^H } \\ Q^H \end{array} \end{array} \end{equation*}
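The assembly in this solution can be verified numerically; a sketch assuming scipy is available (the block sizes and random blocks are illustrative choices):

```python
import numpy as np
from scipy.linalg import schur, block_diag

rng = np.random.default_rng(5)
m, n = 3, 2
Atl = rng.standard_normal((m, m))
Abr = rng.standard_normal((n, n))
Atr = rng.standard_normal((m, n))
A = np.block([[Atl, Atr], [np.zeros((n, m)), Abr]])

# Schur decompositions of the diagonal blocks.
Utl, Qtl = schur(Atl, output='complex')
Ubr, Qbr = schur(Abr, output='complex')

# Assemble Q and U as in the solution above.
Q = block_diag(Qtl, Qbr)
U = np.block([[Utl, Qtl.conj().T @ Atr @ Qbr],
              [np.zeros((n, m)), Ubr]])

print(np.allclose(Q @ U @ Q.conj().T, A))   # A = Q U Q^H
print(np.allclose(np.tril(U, -1), 0))       # U is upper triangular
```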
Homework 9.2.4.4.

Generalize the result in the last homework for block upper triangular matrices:

\begin{equation*} A = \left( \begin{array}{c | c | c | c } A_{0,0} \amp A_{0,1} \amp \cdots \amp A_{0,N-1} \\ \hline 0\amp A_{1,1} \amp \cdots \amp A_{1,N-1} \\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp A_{N-1,N-1} \end{array} \right). \end{equation*}
Solution

For \(i = 0, \ldots , N-1 \text{,}\) let \(A_{i,i} = Q_i U_{i} Q_i^H \) be the Schur decomposition of \(A_{i,i} \text{.}\) Then

\begin{equation*} \begin{array}{l} A\\ ~~~= ~~~~ \\ \left( \begin{array}{c | c | c | c } A_{0,0} \amp A_{0,1} \amp \cdots \amp A_{0,N-1} \\ \hline 0\amp A_{1,1} \amp \cdots \amp A_{1,N-1} \\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp A_{N-1,N-1} \end{array} \right) \\ ~~~= ~~~~ \\ \left( \begin{array}{c | c | c | c } Q_0 U_{0} Q_0^H\amp A_{0,1} \amp \cdots \amp A_{0,N-1} \\ \hline 0\amp Q_1 U_{1} Q_1^H\amp \cdots \amp A_{1,N-1} \\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp Q_{N-1} U_{N-1} Q_{N-1}^H \end{array} \right) \\ ~~~= ~~~~ \\ \left( \begin{array}{c | c | c | c } Q_0 \amp 0 \amp \cdots \amp 0 \\ \hline 0\amp Q_1 \amp \cdots \amp 0 \\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp Q_{N-1} \end{array} \right) \\ ~~~~~~~~~~~ \left( \begin{array}{c | c | c | c } U_{0} \amp Q_0^H A_{0,1} Q_1\amp \cdots \amp Q_0^H A_{0,N-1} Q_{N-1}\\ \hline 0\amp U_{1} \amp \cdots \amp Q_1^H A_{1,N-1} Q_{N-1}\\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp U_{N-1} \end{array} \right) \\ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \left( \begin{array}{c | c | c | c } Q_0 \amp 0 \amp \cdots \amp 0 \\ \hline 0\amp Q_1 \amp \cdots \amp 0 \\ \hline 0 \amp 0 \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp Q_{N-1} \end{array} \right)^H. \end{array} \end{equation*}
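The \(N\)-block case can likewise be checked numerically: a block-diagonal \(Q \) built from the blocks' Schur factors triangularizes \(A \text{.}\) A sketch assuming scipy is available (block sizes and random entries are illustrative choices):

```python
import numpy as np
from scipy.linalg import schur, block_diag

rng = np.random.default_rng(6)
sizes = [2, 3, 2]                       # N = 3 diagonal blocks
offs = np.concatenate(([0], np.cumsum(sizes)))
m = offs[-1]

# Random block upper triangular matrix A.
A = rng.standard_normal((m, m))
for i in range(len(sizes)):
    A[offs[i + 1]:, offs[i]:offs[i + 1]] = 0   # zero blocks below the diagonal

# Schur decomposition of each diagonal block A_{i,i} = Q_i U_i Q_i^H.
Qs = [schur(A[offs[i]:offs[i + 1], offs[i]:offs[i + 1]], output='complex')[1]
      for i in range(len(sizes))]

Q = block_diag(*Qs)
U = Q.conj().T @ A @ Q    # upper triangular, by the block structure

print(np.allclose(Q @ U @ Q.conj().T, A))   # A = Q U Q^H
print(np.allclose(np.tril(U, -1), 0))       # U is upper triangular
```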