Subsection 10.2.3 Deflating the problem

Recall that if

\begin{equation*} A = \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right), A_{00} x = \lambda x, \mbox{ and } A_{11} y = \mu y, \end{equation*}

then

\begin{equation*} \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c} x \\ \hline 0 \end{array} \right) = \lambda \left( \begin{array}{c} x \\ \hline 0 \end{array} \right) \mbox{ and } \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c} 0 \\ \hline y \end{array} \right) = \mu \left( \begin{array}{c} 0 \\ \hline y \end{array} \right) . \end{equation*}

In other words, \(\Lambda( A ) = \Lambda( A_{00} ) \cup \Lambda( A_{11} ) \) and eigenvectors of \(A \) can be easily constructed from eigenvectors of \(A_{00} \) and \(A_{11} \text{.}\)
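
The following is a small numerical check of this observation (a sketch in Python with NumPy; the specific blocks are made up for illustration):

```python
import numpy as np

# Two Hermitian diagonal blocks, chosen arbitrarily for illustration.
A00 = np.array([[4.0, 1.0],
                [1.0, 3.0]])
A11 = np.array([[2.0, 0.5],
                [0.5, 1.0]])
A = np.block([[A00, np.zeros((2, 2))],
              [np.zeros((2, 2)), A11]])

# An eigenvector of A00, padded with zeros, is an eigenvector of A:
lam, X = np.linalg.eigh(A00)
x_padded = np.concatenate([X[:, 0], np.zeros(2)])
print(np.allclose(A @ x_padded, lam[0] * x_padded))        # True

# Lambda(A) = Lambda(A00) union Lambda(A11):
print(np.allclose(np.sort(np.linalg.eigvalsh(A)),
                  np.sort(np.concatenate([np.linalg.eigvalsh(A00),
                                          np.linalg.eigvalsh(A11)]))))
```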

This insight allows us to deflate a matrix when strategically placed zeroes (or, rather, acceptably small entries) appear as part of the QR algorithm. Let us continue to focus on the Hermitian eigenvalue problem.

Homework 10.2.3.1.

Let \(A \in \Cmxm \) be a Hermitian matrix and \(V \in \Cmxm \) be a unitary matrix such that

\begin{equation*} V^H A V = \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) . \end{equation*}

If \(V_{00} \) and \(V_{11} \) are unitary matrices such that \(V_{00}^H A_{00} V_{00} = \Lambda_0 \) and \(V_{11}^H A_{11} V_{11} = \Lambda_1 \text{,}\) are both diagonal, show that

\begin{equation*} \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right)^H A \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right) = \left( \begin{array}{c | c} \Lambda_{0} \amp 0 \\ \hline 0 \amp \Lambda_{1} \end{array} \right) . \end{equation*}
Solution
\begin{equation*} \begin{array}{l} \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right)^H A \left( V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \right) \\ ~~~ = ~~~~ \lt ( X Y )^H = Y^H X^H \gt \\ \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right)^H V^H A V \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt V^H A V = \diag{ A_{00}, A_{11} } \gt \\ \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right)^H \left( \begin{array}{c | c} A_{00} \amp 0 \\ \hline 0 \amp A_{11} \end{array} \right) \left( \begin{array}{c | c} V_{00} \amp 0 \\ \hline 0 \amp V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt \mbox{ partitioned matrix-matrix multiplication } \gt \\ \left( \begin{array}{c | c} V_{00}^H A_{00} V_{00} \amp 0 \\ \hline 0 \amp V_{11}^H A_{11} V_{11} \end{array} \right) \\ ~~~ = ~~~~ \lt V_{00}^H A_{00} V_{00} = \Lambda_0 ; V_{11}^H A_{11} V_{11} = \Lambda_1 \gt \\ \left( \begin{array}{c | c} \Lambda_{0} \amp 0 \\ \hline 0 \amp \Lambda_{1} \end{array} \right) . \end{array} \end{equation*}

The point of this last exercise is that if at some point the QR algorithm yields a block diagonal matrix, then the algorithm can proceed to find the spectral decompositions of the blocks on the diagonal, updating the matrix, \(V \text{,}\) in which the eigenvectors are accumulated.
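
In code, this accumulation might look as follows (a minimal sketch in Python with NumPy; the routine name deflate_and_accumulate and the assumption that the split point \(k\) is known are ours, not part of the text):

```python
import numpy as np

def deflate_and_accumulate(A, V, k):
    """Given unitary V such that V^H A V = diag(A00, A11), with A00 of
    size k x k, diagonalize the blocks and fold V00, V11 back into V."""
    m = A.shape[0]
    T = V.conj().T @ A @ V                       # block diagonal (numerically)
    Lam0, V00 = np.linalg.eigh(T[:k, :k])        # A00 = V00 Lambda0 V00^H
    Lam1, V11 = np.linalg.eigh(T[k:, k:])        # A11 = V11 Lambda1 V11^H
    W = np.block([[V00, np.zeros((k, m - k))],
                  [np.zeros((m - k, k)), V11]])
    # Now (V W)^H A (V W) = diag(Lambda0, Lambda1).
    return V @ W, np.concatenate([Lam0, Lam1])
```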

Now, since it is the last column of \(V^{(k)} \) that converges fastest to an eigenvector, eventually we expect \(A^{(k)} \) computed as part of the QR algorithm to be of the form

\begin{equation*} A^{(k)} = \left( \begin{array}{c | c} A_{00}^{(k)} \amp f_{01}^{(k)} \\ \hline {f_{01}^{(k)}~}^H \amp \alpha_{m-1,m-1}^{(k)} \end{array} \right) , \end{equation*}

where \(f_{01}^{(k)} \) is small. In other words,

\begin{equation*} A^{(k)} \approx \left( \begin{array}{c | c} A_{00}^{(k)}\amp 0 \\ \hline 0 \amp \alpha_{m-1,m-1}^{(k)} \end{array} \right) . \end{equation*}

Once \(f_{01}^{(k)} \) is small enough, the algorithm can continue with \(A_{00}^{(k)} \text{.}\) The problem is thus deflated to a smaller problem.

What criterion should we use to deflate? If the active matrix is \(m \times m \text{,}\) for now we use the criterion

\begin{equation*} \| f_{01}^{(k)} \|_1 \leq \epsilon_{\rm mach} ( \vert \alpha_{0,0}^{(k)} \vert + \cdots + \vert \alpha_{m-1,m-1}^{(k)} \vert). \end{equation*}

The idea is that if the magnitudes of the off-diagonal elements of the last row are small relative to the eigenvalues, then they can be considered to be zero. The sum of the absolute values of the diagonal elements serves as an estimate of the magnitudes of the eigenvalues. We will refine this criterion later.
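
A sketch of this test (in Python with NumPy; the function name can_deflate is hypothetical, and the active block is assumed to be stored as a dense Hermitian array):

```python
import numpy as np

def can_deflate(A_active):
    """Test whether the last row/column of the active block decouples:
    || f01 ||_1 <= eps_mach * ( |alpha_00| + ... + |alpha_{m-1,m-1}| )."""
    m = A_active.shape[0]
    f01 = A_active[:m - 1, m - 1]            # last column above the diagonal
    eps_mach = np.finfo(A_active.dtype).eps
    return np.sum(np.abs(f01)) <= eps_mach * np.sum(np.abs(np.diag(A_active)))
```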

Remark 10.2.3.1.

It is possible that deflation can happen anywhere in the matrix and one should check for that. However, it is most likely to happen in the last row and column of the active part of the matrix.
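
Such a check over all possible split points might look as follows (again a sketch in Python with NumPy; find_split is a hypothetical helper, and for simplicity it scans every split, which is more work than a practical implementation would do):

```python
import numpy as np

def find_split(A_active):
    """Scan all split points for a place where the active Hermitian
    block numerically decouples into diag(A00, A11)."""
    m = A_active.shape[0]
    eps_mach = np.finfo(A_active.dtype).eps
    scale = np.sum(np.abs(np.diag(A_active)))
    for k in range(1, m):
        # Coupling between the leading k x k block and the trailing block:
        if np.sum(np.abs(A_active[k:, :k])) <= eps_mach * scale:
            return k         # deflate into A[:k, :k] and A[k:, k:]
    return None              # no deflation possible yet
```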