ALAFF A simple shifted QR algorithm

Subsection 10.2.2 A simple shifted QR algorithm

The equivalence of the subspace iteration and the QR algorithm tells us a lot about convergence. Under mild conditions \((\vert \lambda_0 \vert \geq \cdots \geq \vert \lambda_{n-1} \vert \gt \vert \lambda_n \vert \gt \cdots \vert \lambda_{m-1} \vert)\text{,}\)

The first \(n \) columns of \(V^{(k)} \) converge to a basis for the subspace spanned by the eigenvectors associated with \(\lambda_0, \ldots, \lambda_{n-1} \text{.}\)
The last \(m-n \) columns of \(V^{(k)} \) converge to the subspace orthogonal to the subspace spanned by the eigenvectors associated with \(\lambda_0, \ldots, \lambda_{n-1} \text{.}\)
If \(A \) is Hermitian, then the eigenvectors associated with \(\lambda_0, \ldots, \lambda_{n-1} \text{,}\) are orthogonal to those associated with \(\lambda_n, \ldots, \lambda_{m-1} \text{.}\) Hence, the subspace spanned by the eigenvectors associated with \(\lambda_0, \ldots, \lambda_{n-1} \) is orthogonal to the subspace spanned by the eigenvectors associated with \(\lambda_n, \ldots, \lambda_{m-1} \text{.}\)
The rate of convergence with which these subspaces become orthogonal to each other is linear with a constant \(\vert \lambda_n \vert / \vert \lambda_{n-1} \vert \text{.}\)

What if in this situation we focus on \(n = m-1 \text{?}\) Then

The last column of \(V^{(k)} \) converges to point in the direction of the eigenvector associated with \(\lambda_{m-1} \text{,}\) the smallest in magnitude.
The rate of convergence of that vector is linear with a constant \(\vert \lambda_{m-1} \vert / \vert \lambda_{m-2} \vert \text{.}\)

In other words, the subspace iteration acts upon the last column of \(V^{(k)} \) in the same way as would an inverse iteration. This observation suggests that that convergence can be greatly accelerated by shifting the matrix by an estimate of the eigenvalue that is smallest in magnitude.

Homework 10.2.2.1.

Copy SimpleQRAlg.m into SimpleShiftedQRAlgConstantShift.m and modify it to implement an algorithm that executes the QR algorithm with a shifted matrix \(A - \rho I \text{:}\)

function [ Ak, V ] = SimpleShiftedQRAlgConstantShift( A, rho, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, changing (0) to (1), and use it to examine the convergence of the method.

Try different values for \(\rho \text{:}\) \(0.0 \text{,}\) \(0.9 \text{,}\) \(0.99 \text{,}\) \(1.0 \text{,}\) \(1.99 \text{,}\) \(1.5 \text{.}\) What do you observe?

Solution

Assignments/Week10/answers/SimpleShiftedQRAlgConstantShift.m

Discuss what you observe online with others!

We could compute the Rayleigh quotient with the last column of \(V^{(k)} \) but a moment of reflection tells us that that estimate is already available as the last element on the diagonal of \(A^{(k)} \text{,}\) because the diagonal elements of \(A^{(k)}\) converge to the eigenvalues. Thus, we arrive upon a simple shifted QR algorithm in Figure 10.2.2.1. This algorithm inherits the cubic convergence of the Rayleigh quotient iteration, for the last column of \(V \text{.}\)

\begin{equation*} \begin{array}{l} V = I \\ {\bf for~} k:=0, \ldots \\ ~~~( Q, R ) := {\rm QR}( A - \alpha_{m-1,m-1} I ) \\ ~~~ A = R Q + \alpha_{m-1,m-1} I \\ ~~~ V = V Q \\ {\bf endfor} \\ \end{array} \end{equation*}

Figure 10.2.2.1. Simple shifted QR algorithm.

To more carefully examine this algorithm, let us annotate it as we did for the simple QR algorithm in the last unit.

\begin{equation*} \begin{array}{l} A^{(0)} = A \\ V^{(0)} = I \\ R^{(0)} = I \\ {\bf for~} k:=0, \ldots \\ ~~~\mu_k = \alpha_{m-1,m-1}^{(k)} \\ ~~~( Q^{(k+1)}, R^{(k+1)} ) := {\rm QR}( A^{(k)} - \mu_k I ) \\ ~~~ A^{(k+1)} = R^{(k+1)} Q^{(k+1)} + \mu_k I \\ ~~~ V^{(k+1)} = V^{(k)} Q^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

The following exercises clarify some of the finer points.

Homework 10.2.2.2.

For the above algorithm, show that

\(A^{(k+1)} = Q^{(k+1)\,H} A^{(k)} Q^{(k+1)} \text{.}\)
\(A^{(k+1)} = V^{(k+1)\,H} A^{(0)} V^{(k+1)} \text{.}\)

Solution

n The algorithm computes the QR factorization of \(A^{(k)} - \mu_k I \)

\begin{equation*} A^{(k)} - \mu_k I = Q^{(k+1)} R^{(k+1)} \end{equation*}

after which

\begin{equation*} A^{(k+1)} := R^{(k+1)} Q^{(k+1)} + \mu_k I \end{equation*}

Hence

\begin{equation*} \begin{array}{l} A^{(k+1)} \\ ~~~=~~~~ \\ R^{(k+1)} Q^{(k+1)} + \mu_k I \\ ~~~=~~~~ \\ {Q^{(k+1)}~}^H (A^{(k)}-\mu_k I ) Q^{(k+1)} + \mu_kI \\ ~~~=~~~~ \\ {Q^{(k+1)}~}^H A^{(k)} Q^{(k+1)}-\mu_k {Q^{(k+1)}~}^H Q^{(k+1)} + \mu_kI \\ ~~~=~~~~ \\ {Q^{(k+1)}~}^H A^{(k)} Q^{(k+1)}. \end{array} \end{equation*}

The second part can be easily proven via induction.

This last exercise confirms that the eigenvalues of \(A^{(k)} \) equal the eigenvalues of \(A \text{.}\)

Homework 10.2.2.3.

For the above algorithm, show that

\begin{equation*} \begin{array}{l} ( A - \mu_{k-1} I) ( A - \mu_{k-2} I) \cdots ( A - \mu_1 I) ( A - \mu_0 I) \\ ~~~~~= \begin{array}[t]{c} \underbrace{ Q^{(0)} Q^{(1)} \cdots Q^{(k)} } \\ \mbox{unitary} \end{array} \begin{array}[t]{c} \underbrace{ R^{(k)} \cdots R^{(1)} R^{(0)}. } \\ \mbox{upper triangular} \end{array} \end{array} \end{equation*}

Solution

In this problem, we need to assume that \(Q^{(0)} = I \text{.}\) Also, it helps to recognize that \(V^{(k)} = Q^{(0)} \cdots Q^{(k)} \text{,}\) which can be shown via a simple inductive proof.

This requires a proof by induction.

Base case: \(k = 1 \text{.}\)

\begin{equation*} \begin{array}{l} A - \mu_0 I \\ ~~~=~~~~ \lt A^{(0)} = A \gt \\ A^{(0)} - \mu_0 I \\ ~~~=~~~~ \lt \mbox{ algorithm } \gt \\ Q^{(1)} R^{(1)} \\ ~~~=~~~~ \lt Q^{(0)} = R^{(0)} = I \\ Q^{(0)} Q^{(1)} R^{(1)} R^{(0)} \end{array} \end{equation*}
Inductive Step: Assume

\begin{equation*} \begin{array}{l} ( A - \mu_{k-1} I) ( A - \mu_{k-2} I) \cdots ( A - \mu_1 I) ( A - \mu_0 I) \\ ~~~~~= \begin{array}[t]{c} \underbrace{ Q^{(0)} Q^{(1)} \cdots Q^{(k)} } \\ V^{(k)} \end{array} \begin{array}[t]{c} \underbrace{ R^{(k)} \cdots R^{(1)} R^{(0)}. } \\ \mbox{upper triangular} \end{array} \end{array} \end{equation*}

Show that

\begin{equation*} \begin{array}{l} ( A - \mu_{k} I) ( A - \mu_{k-1} I) \cdots ( A - \mu_1 I) ( A - \mu_0 I) \\ ~~~~~= \begin{array}[t]{c} \underbrace{ Q^{(0)} Q^{(1)} \cdots Q^{(k+1)} } \\ V^{(k+1)} \end{array} \begin{array}[t]{c} \underbrace{ R^{(k+1)} \cdots R^{(1)} R^{(0)}. } \\ \mbox{upper triangular} \end{array} \end{array} \end{equation*}

Notice that

\begin{equation*} \begin{array}{l} ( A - \mu_k I )( A - \mu_{k-1} I ) \cdots ( A - \mu_{0} I ) \\ ~~~=~~~~ \lt \mbox{ last homework } \gt \\ ( V^{(k+1)} A^{(k+1)} V^{(k+1)\,H} - \mu_k I ) ( A - \mu_{k-1} I ) \cdots ( A - \mu_{0} I ) \\ ~~~=~~~~ \lt I = V^{(k+1)} V^{(k+1)\,H} \gt \\ V^{(k+1)} ( A^{(k+1)} - \mu_k I ) V^{(k+1)\,H}( A - \mu_{k-1} I ) \cdots ( A - \mu_{0} I ) \\ ~~~=~~~~ \lt \mbox{ I.H. } \gt \\ V^{(k+1)} ( A^{(k+1)} - \mu_k I ) V^{(k+1)\,H} V^{(k)} R^{(k)} \cdots R^{(0)} \\ ~~~=~~~~ \lt V^{(k+1)\,H} = Q^{(k+1)\,H} V^{(k)\,H} \gt \\ V^{(k+1)} ( A^{(k+1)} - \mu_k I ) Q^{(k+1)\,H} R^{(k)} \cdots R^{(0)} \\ ~~~=~~~~ \lt \mbox{ algorithm } \gt \\ V^{(k+1)} R^{(k+1)} Q^{(k+1)} Q^{(k+1)\,H} R^{(k)} \cdots R^{(0)} \\ ~~~ = ~~~~ \lt Q^{(k+1)} Q^{(k+1)\,H} = I \gt \\ V^{(k+1)} R^{(k+1)} R^{(k)} \cdots R^{(0)} \end{array} \end{equation*}
By the Principle of Mathematical Induction, the result holds.

Homework 10.2.2.4.

Copy SimpleShiftedQRAlgConstantShift.m into SimpleShiftedQRAlg and modify it to implement an algorithm that executes the QR algorithm in Figure 10.2.2.1:

function [ Ak, V ] = SimpleShiftedQRAlg( A, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, changing (0) to (1), and use it to examine the convergence of the method.

What do you observe?

Solution

Assignments/Week10/answers/SimpleShiftedQRAlg.m

Discuss what you observe online with others!