ALAFF A simple QR algorithm

Subsection 10.2.1 A simple QR algorithm

We now morph the subspace iteration discussed in the last unit into a simple incarnation of an algorithm known as the QR algorithm. We will relate this algorithm to performing subspace iteration with an \(m \times m \) (square) matrix so that the method finds all eigenvectors simultaneously (under mild conditions). Rather than starting with a random matrix \(V \text{,}\) we now start with the identity matrix. This yields the algorithm on the left in Figure 10.2.1.1. We contrast it with the algorithm on the right.

\begin{equation*} \begin{array}{l} \widehat V := I \\ {\bf for~} k:=0, \ldots \\ ~~~( \widehat V, \widehat R ) := {\rm QR}( A \widehat V ) \\[0.1in] ~~~\widehat A = \widehat V^H A \widehat V \\ {\bf endfor} \\ % \widehat \Lambda = \widehat V^H A \widehat V \\ \end{array} \end{equation*}

\begin{equation*} \begin{array}{l} V = I \\ {\bf for~} k:=0, \ldots \\ ~~~( Q, R ) := {\rm QR}( A ) \\ ~~~ A = R Q \\ ~~~ V = V Q \\ {\bf endfor} \\ \end{array} \end{equation*}

Figure 10.2.1.1. Left: Subspace iteration started with \(\widehat V = I \text{.}\) Right: Simple QR algorithm.

The magic lies in the fact that the matrices computed by the QR algorithm are identical to those computed by the subspace iteration: Upon completion \(\widehat V = V \) and the matrix \(\widehat A \) on the left equals the (updated) matrix \(A \) on the right. To be able to prove this, we annotate the algorithm so we can reason about the contents of the matrices for each iteration.

\begin{equation*} \begin{array}{l} \widehat A^{(0)} := A \\ \widehat V^{(0)} := I \\ \widehat R^{(0)} := I \\ {\bf for~} k:=0, \ldots \\ ~~~( \widehat V^{(k+1)}, \widehat R^{(k+1)} ) := {\rm QR}( A \widehat V^{(k)} ) \\ ~~~ \widehat A^{(k+1)} := { \widehat V^{(k+1)}~}^H A \widehat V^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

\begin{equation*} \begin{array}{l} A^{(0)} := A \phantom{\widehat A^{(0)}}\\ V^{(0)} := I \phantom{\widehat V^{(0)}}\\ R^{(0)} := I \phantom{\widehat R^{(0)}}\\ {\bf for~} k:=0, \ldots \\ ~~~( Q^{(k+1)}, R^{(k+1)} ) := {\rm QR}( A^{(k)} ) \\ ~~~ A^{(k+1)} := R^{(k+1)} Q^{(k+1)} \\ ~~~ V^{(k+1)} := V^{(k)} Q^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

Let's start by showing how the QR algorithm applies unitary equivalence transformations to the matrices \(A^{(k)} \text{.}\)

Homework 10.2.1.1.

Show that for the algorithm on the right \(A^{(k+1)} = {Q^{(k+1)}~}^H A^{(k)} Q^{(k+1)} \text{.}\)

Solution

The algorithm computes the QR factorization of \(A^{(k)} \)

\begin{equation*} A^{(k)} = Q^{(k+1)} R^{(k+1)} \end{equation*}

after which

\begin{equation*} A^{(k+1)} := R^{(k+1)} Q^{(k+1)} \end{equation*}

Hence

\begin{equation*} A^{(k+1)} = R^{(k+1)} Q^{(k+1)} = {Q^{(k+1)}~}^H A^{(k)} Q^{(k+1)}. \end{equation*}

This last homework shows that \(A^{(k+1)} \) is derived from \(A^{(k)} \) via a unitary similarity transformation and hence has the same eigenvalues as does \(A^{(k)} \text{.}\) This means it also is derived from \(A\) via a (sequence of) unitary similarity transformation and hence has the same eigenvalues as does \(A \text{.}\)

We now prove these algorithms mathematically equivalent.

Homework 10.2.1.2.

In the above algorithms, for all \(k \text{,}\)

\(\widehat A^{(k)} = A^{(k)}\text{.}\)
\(\widehat R^{(k)} = R^{(k)}\text{.}\)
\(\widehat V^{(k)} = V^{(k)} \text{.}\)

Hint

The QR factorization is unique, provided the diagonal elements of \(R \) are taken to be positive.

Solution

We will employ a proof by induction.

Base case: \(k = 0 \)

This is trivially true:
- \(\widehat A^{(0)} = A = A^{(0)} \text{.}\)
- \(\widehat R^{(0)} = I = R^{(0)} \text{.}\)
- \(\widehat V^{(0)} = I = V^{(0)} \text{.}\)
Inductive step: Assume that \(\widehat A^{(k)} = A^{(k)}\text{,}\) \(\widehat R^{(k)} = R^{(k)}\text{,}\) and \(\widehat V^{(k)} = V^{(k)} \text{.}\) Show that \(\widehat A^{(k+1)} = A^{(k+1)} \text{,}\) \(\widehat R^{(k+1)} = R^{(k+1)}\text{,}\) and \(\widehat V^{(k+1)} = V^{(k+1)}\text{.}\)

From the algorithm on the left, we know that

\begin{equation*} A \widehat V^{(k)} = \widehat V^{(k+1)} \widehat R^{(k+1)}. \end{equation*}

and

\begin{equation} \begin{array}{l} A^{(k)} \\ ~~~=~~~~ \lt (I.H.) \gt \\ \widehat A^{(k)} \\ ~~~=~~~~ \lt \mbox{ algorithm on left }\gt \\ \widehat V^{(k)\,H} A \widehat V^{(k)} \\ ~~~=~~~~ \lt \mbox{ algorithm on left } \gt \\ \widehat V^{(k)\,H} \widehat V^{(k+1)}\widehat R^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ I.H. } \gt \\ V^{(k)\,H} \widehat V^{(k+1)}\widehat R^{(k+1)}. \end{array}\label{chapter10-qr-eqn-1}\tag{10.2.1} \end{equation}

But from the algorithm on the right, we know that

\begin{equation} A^{(k)} = Q^{(k+1)} R^{(k+1)}.\label{chapter10-qr-eqn-2}\tag{10.2.2} \end{equation}

Both (10.2.1) and (10.2.2) are QR factorizations of \(A^{(k)} \) and hence, by the uniqueness of the QR factorization,

\begin{equation*} \widehat R^{(k+1)} = R^{(k+1)} \end{equation*}

and

\begin{equation*} Q^{(k+1)} = V^{(k)\,H} \widehat V^{(k+1)} \end{equation*}

or, equivalently and from the algorithm on the right,

\begin{equation*} \begin{array}[t]{c} \underbrace{ V^{(k)} Q^{(k+1)}} \\ V^{(k+1)} \end{array} = \widehat V^{(k+1)}. \end{equation*}

This shows that
- \(\widehat R^{(k+1)} = R^{(k+1)}\) and
- \(\widehat V^{(k+1)} = V^{(k+1)} \text{.}\)
Also,

\begin{equation*} \begin{array}{l} \widehat A^{(k+1)}\\ ~~~=~~~~ \lt \mbox{ algorithm on left } \gt \\ \widehat V^{(k+1)\,H} A \widehat V^{(k+1)} \\ ~~~=~~~~ \lt \widehat V^{(k+1)} = V^{(k+1)} \gt \\ V^{(k+1)\,H} A V^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ algorithm on right } \gt \\ Q^{(k+1)\,H} V^{(k)\,H} A V^{(k)} Q^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ I.H. } \gt \\ Q^{(k+1)\,H} \widehat V^{(k)\,H} A \widehat V^{(k)} Q^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ algorithm on left } \gt \\ Q^{(k+1)\,H} \widehat A^{(k)} Q^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ I.H. } \gt \\ Q^{(k+1)\,H} A^{(k)} Q^{(k+1)} \\ ~~~=~~~~ \lt \mbox{ last homework} \gt \\ A^{(k+1)}. \end{array} \end{equation*}
By the Principle of Mathematical Induction, the result holds.

Homework 10.2.1.3.

In the above algorithms, show that for all \(k \)

\(V^{(k)} = Q^{(0)} Q^{(1)} \cdots Q^{(k)}\text{.}\)
\(A^{k} = V^{(k)} R^{(k)} \cdots R^{(1)} R^{(0)} \text{.}\) (Note: \(A^{k} \) here denotes \(A \) raised to the \(k \)th power.)

Assume that \(Q^{(0)} = I \text{.}\)

Solution

We will employ a proof by induction.

Base case: \(k = 0 \)

\(\begin{array}{c} \underbrace{A^0}\\ I \end{array} = \begin{array}{c} \underbrace{V^{(0)}}\\ I \end{array} A^0 = \begin{array}{c} \underbrace{ R^{(0}) } \\ I \end{array} \text{.}\)
Inductive step: Assume that \(V^{(k)} = Q^{(0)} \cdots Q^{(k)}\) and \(A^{k} = V^{(k)} R^{(k)} \cdots R^{(0)} \text{.}\) Show that \(V^{(k+1)} = Q^{(0)} \cdots Q^{(k+1)}\) and \(A^{k+1} = V^{(k+1)} R^{(k+1)} \cdots R^{(0)} \text{.}\)

\begin{equation*} V^{(k+1)} = V^{(k)} Q^{(k+1)} = Q^{(0)} \cdots Q^{(k)} Q^{(k+1)}. \end{equation*}

by the inductive hypothesis.

Also,

\begin{equation*} \begin{array}{l} A^{k+1} \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ A A^{k} \\ ~~~=~~~~ \lt \mbox{ inductive hypothesis } \gt \\ A V^{(k)} R^{(k)} \cdots R^{(0)} \\ ~~~=~~~~ \lt \mbox{ inductive hypothesis } \gt \\ A \widehat V^{(k)} R^{(k)} \cdots R^{(0)} \\ ~~~=~~~~ \lt \mbox{ left algorithm } \gt \\ \widehat V^{(k+1)} \widehat R^{(k+1)} R^{(k)}\cdots R^{(0)} \\ ~~~=~~~~ \lt V^{(k+1)} = \widehat V^{(k+1)}; R^{(k+1)} = \widehat R^{(k+1)} \gt \\ V^{(k+1)} R^{(k+1)} R^{(k)}\cdots R^{(0)} . \end{array} \end{equation*}
By the Principle of Mathematical Induction, the result holds for all \(k \text{.}\)

This last exercise shows that

\begin{equation*} A^{k} = \begin{array}[t]{c} \underbrace{ Q^{(0)} Q^{(1)} \cdots Q^{(k)} } \\ \mbox{unitary } V^{(k)} \end{array} \begin{array}[t]{c} \underbrace{ R^{(k)} \cdots R^{(1)} R^{(0)} } \\ \mbox{upper triangular } \widetilde R^{(k)} \end{array} \end{equation*}

which exposes a QR factorization of \(A^{k} \text{.}\) Partitioning \(V^{(k)} \) by columns

\begin{equation*} V^{(k)} = \left( \begin{array}{c | c | c} v_0^{(k)} \amp \cdots \amp v_{m-1}^{(k)} \end{array} \right) \end{equation*}

we notice that applying \(k \) iterations of the Power Method to vector \(e_0 \) yields

\begin{equation*} A^k e_0 = V^{(k)} \widetilde R^{(k)} e_0 = V^{(k)} \widetilde \rho_{0,0}^{(k)} e_0 = \widetilde \rho_{0,0}^{(k)} V^{(k)} e_0 = \widetilde \rho_{0,0}^{(k)} v_0^{(k)} , \end{equation*}

where \(\widetilde \rho_{0,0}^{(k)} \) is the \((0,0) \) entry in matrix \(\widetilde R^{(k)} \text{.}\) Thus, the first column of \(V^{(k)} \) equals a vector that would result from \(k \) iterations of the Power Method. Similarly, the second column of \(V^{(k)} \) equals a vector that would result from \(k \) iterations of the Power Method, but orthogonal to \(v_0^{(k)} \text{.}\)

We make some final observations:

\(A^{(k+1)} = Q^{(k)\,H} A^{(k)} Q^{(k)} \text{.}\) This means we can think of \(A^{(k+1)} \) as the matrix \(A^{(k)} \) but viewed in a new basis (namely the basis that consists of the column of \(Q^{(k)} \)).
\(A^{(k+1)} = ( Q^{(0)} \cdots Q^{(k)})^H A Q^{(0)} \cdots Q^{(k)} = V^{(k)\,H} A V^{(k)} \text{.}\) This means we can think of \(A^{(k+1)} \) as the matrix \(A \) but viewed in a new basis (namely the basis that consists of the column of \(V^{(k)} \)).
In each step, we compute

\begin{equation*} ( Q^{(k+1)}, R^{(k+1)} ) = QR( A^{(k)} ) \end{equation*}

which we can think of as

\begin{equation*} ( Q^{(k+1)}, R^{(k+1)} )= QR( A^{(k)} \times I ) . \end{equation*}

This suggests that in each iteration we perform one step of subspace iteration, but with matrix \(A^{(k)} \) and \(V = I \text{:}\)

\begin{equation*} ( Q^{(k+1)}, R^{(k+1)} ) = QR( A^{(k)} V ) . \end{equation*}
The insight is that the QR algorithm is identical to subspace iteration, except that at each step we reorient the problem (express it in a new basis) and we restart it with \(V = I \text{.}\)

Homework 10.2.1.4.

Examine Assignments/Week10/matlab/SubspaceIterationAllVectors.m, which implements the subspace iteration in Figure 10.2.1.1 (left). Examine it by executing the script in Assignments/Week10/matlab/test_simple_QR_algorithm.m.

Solution

Discuss what you observe online with others!

Homework 10.2.1.5.

Copy Assignments/Week10/matlab/SubspaceIterationAllVectors.m into SimpleQRAlg.m and modify it to implement the algorithm in Figure 10.2.1.1 (right) as

function [ Ak, V ] = SimpleQRAlg( A, maxits, illustrate, delay )

Modify the appropriate line in Assignments/Week10/matlab/test_simple_QR_algorithms.m, changing (0) to (1), and use it to examine the convergence of the method.

What do you observe?

Solution

Assignments/Week10/answers/SimpleQRAlg.m

Discuss what you observe online with others!