Subsection 10.2.4 Deflating the problem

Recall that if

\begin{equation*} A = \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right), A_{00} x = \lambda x, \mbox{ and } A_{11} y = \mu y, \end{equation*}

then

\begin{equation*} \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c} x \\ \hline 0 \end{array} \right) = \lambda \left( \begin{array}{c} x \\ \hline 0 \end{array} \right) \mbox{ and } \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c} 0 \\ \hline y \end{array} \right) = \mu \left( \begin{array}{c} 0 \\ \hline y \end{array} \right) . \end{equation*}

In other words, \(\Lambda( A ) = \Lambda( A_{00} ) \cup \Lambda( A_{11} ) \) and eigenvectors of \(A \) can be easily constructed from eigenvectors of \(A_{00} \) and \(A_{11} \text{.}\)
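To see this numerically, the following small check (illustrative only; it assumes NumPy is available) verifies that the eigenvalues of a block diagonal matrix are the union of the eigenvalues of its diagonal blocks:

```python
import numpy as np

# Illustrative check: Lambda(A) = Lambda(A00) union Lambda(A11)
# when A is block diagonal with Hermitian blocks A00 and A11.
A00 = np.array([[2.0, 1.0],
                [1.0, 3.0]])
A11 = np.array([[5.0, 0.0],
                [0.0, 7.0]])

A = np.block([[A00, np.zeros((2, 2))],
              [np.zeros((2, 2)), A11]])

eig_A = np.sort(np.linalg.eigvalsh(A))
eig_blocks = np.sort(np.hstack([np.linalg.eigvalsh(A00),
                                np.linalg.eigvalsh(A11)]))
print(np.allclose(eig_A, eig_blocks))  # True
```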

This insight allows us to deflate a matrix when strategically placed zeroes (or, rather, acceptably small entries) appear as part of the QR algorithm. Let us continue to focus on the Hermitian eigenvalue problem.

Homework 10.2.4.1.

Let \(A \in \Cmxm \) be a Hermitian matrix and \(V \in \Cmxm \) be a unitary matrix such that

\begin{equation*} V^H A V = \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) . \end{equation*}

If \(V_{00} \) and \(V_{11} \) are unitary matrices such that \(V_{00}^H A_{00} V_{00} = \Lambda_0 \) and \(V_{11}^H A_{11} V_{11} = \Lambda_1 \) are both diagonal, show that

\begin{equation*} \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right)^H A \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right) = \left( \begin{array}{c | c} \Lambda_{0} \amp 0 \\ \hline 0 \amp \Lambda_{1} \end{array} \right) . \end{equation*}
Solution
\begin{equation*} \begin{array}{l} \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right)^H A \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right) \\ ~~~ = ~~~~ \lt ( X Y )^H = Y^H X^H \gt \\ \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right)^H V^H A V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt V^H A V = \diag( A_{00}, A_{11} ) \gt \\ \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right)^H \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt \mbox{ partitioned matrix-matrix multiplication } \gt \\ \left( \begin{array}{c | c} V_{00}^H A_{00} V_{00} \amp 0 \\ \hline 0 \amp V_{11}^H A_{11} V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt V_{00}^H A_{00} V_{00} = \Lambda_0 ; V_{11}^H A_{11} V_{11} = \Lambda_1 \gt \\ \left( \begin{array}{c | c} \Lambda_{0} \amp 0 \\ \hline 0 \amp \Lambda_{1} \end{array} \right) \end{array} \end{equation*}

The point of this last exercise is that if at some point the QR algorithm yields a block diagonal matrix, then the algorithm can proceed to find the spectral decompositions of the blocks on the diagonal, updating the matrix in which the eigenvectors are accumulated.
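As a minimal sketch of this bookkeeping, assuming NumPy and that \(V \) holds the unitary matrix accumulated so far (the helper name is hypothetical):

```python
import numpy as np

def update_accumulated_eigenvectors(V, V00, V11):
    """Hypothetical helper: given V with V^H A V = diag(A00, A11) and
    unitary V00, V11 diagonalizing the blocks, return the updated
    matrix whose columns are eigenvectors of A (Homework 10.2.4.1)."""
    m0 = V00.shape[0]
    m1 = V11.shape[0]
    W = np.zeros((m0 + m1, m0 + m1), dtype=V.dtype)
    W[:m0, :m0] = V00      # top-left block
    W[m0:, m0:] = V11      # bottom-right block
    return V @ W           # V * diag(V00, V11)
```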

Now, since it is the last column of \(V^{(k)} \) that converges fastest to an eigenvector, we eventually expect \(A^{(k)} \text{,}\) computed as part of the QR algorithm, to be of the form

\begin{equation*} A^{(k)} = \left( \begin{array}{c | c} A_{00}^{(k)} \amp f_{01}^{(k)} \\ \hline {f_{01}^{(k)}~}^H \amp \alpha_{m,m}^{(k)} \end{array} \right) , \end{equation*}

where \(f_{01}^{(k)} \) is small. In other words,

\begin{equation*} A^{(k)} \approx \left( \begin{array}{c | c} A_{00}^{(k)}\amp 0 \\ \hline 0 \amp \alpha_{m,m}^{(k)} \end{array} \right) . \end{equation*}

Once \(f_{01}^{(k)} \) is small enough, the algorithm can continue with \(A_{00}^{(k)} \text{.}\) The problem is thus deflated to a smaller problem.
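The deflation step itself is simple bookkeeping. The following sketch (the function name is hypothetical; it assumes NumPy and 0-based indexing) zeroes the negligible entries and shrinks the active part of the matrix by one:

```python
import numpy as np

def deflate_last_row(A, n_active):
    """Hypothetical helper: once f01 in the active n_active x n_active
    block is negligible, set it (and its conjugate transpose) to zero
    and shrink the active part, so iteration continues with A00."""
    A[:n_active - 1, n_active - 1] = 0.0   # f01 := 0
    A[n_active - 1, :n_active - 1] = 0.0   # f01^H := 0
    return n_active - 1                    # new active problem size
```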

Deflation can happen anywhere in the matrix. However, it is most likely to occur in the last row and column of the active part of the matrix.

A question is what criterion to use to deflate. If the active matrix is \(m \times m \text{,}\) we will use the criterion

\begin{equation*} \| f_{01} \|_\infty \leq \epsilon_{\rm mach} \left( \vert \alpha_{m-1,m-1} \vert + \vert \alpha_{m,m} \vert \right). \end{equation*}

The idea is that if the magnitudes of the off-diagonal elements of the last row are small relative to the eigenvalues that determine the rate of convergence, then they can be considered to be zero. We will return to this criterion later.
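A direct translation of this criterion might look as follows (illustrative only; it assumes NumPy and shifts to 0-based indexing, so \(\alpha_{m-1,m-1} \) and \(\alpha_{m,m} \) in the text correspond to A[m-2, m-2] and A[m-1, m-1] in the code):

```python
import numpy as np

def should_deflate(A, m):
    """Hypothetical check for the deflation criterion above, applied
    to the active m x m leading block of a Hermitian matrix A."""
    f01 = A[:m - 1, m - 1]              # off-diagonal part of last column
    eps_mach = np.finfo(A.dtype).eps    # machine epsilon for A's dtype
    tol = eps_mach * (abs(A[m - 2, m - 2]) + abs(A[m - 1, m - 1]))
    return np.max(np.abs(f01)) <= tol   # || f01 ||_inf small enough?
```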