
Subsection 10.1.1 Subspace iteration with a Hermitian matrix

The idea behind subspace iteration is to perform the Power Method with more than one vector in order to converge to (a subspace spanned by) the eigenvectors associated with a set of eigenvalues.

We continue our discussion by restricting ourselves to the case where \(A \in \Cmxm\) is Hermitian. Why? Because the eigenvectors associated with distinct eigenvalues of a Hermitian matrix are mutually orthogonal (and can be chosen to be orthonormal), which will simplify our discussion. Here we repeat the Power Method:

\begin{equation*} \begin{array}{lll} v_0 := \mbox{ random vector} \\ v_0^{(0)} := v_0 / \| v_0 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf for~} k:=0, \ldots \\ ~~~ v_0 := A v_0^{(k)} \\ ~~~ v_0^{(k+1)} := v_0 / \| v_0 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf endfor} \end{array} \end{equation*}

In our earlier discussion, we used \(v^{(k)} \) for the current approximation to the eigenvector. We now add a subscript, writing \(v_0^{(k)} \text{,}\) because we will shortly start iterating with multiple vectors.
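
For concreteness, here is a minimal MATLAB sketch of this iteration. It is only a sketch: the routine PowerMethod.m examined in the next homework also takes illustrate and delay parameters for visualization, and the Rayleigh quotient used here for the returned eigenvalue estimate is just one reasonable choice.

% A minimal sketch of the Power Method; the function name is ours and the
% course's PowerMethod.m differs in its parameters and details.
function [ lambda_0, v0 ] = PowerMethodSketch( A, x, maxiters )
    v0 = x / norm( x );              % v0^(0): normalize to have length one
    for k = 1:maxiters
        v0 = A * v0;                 % v0 := A v0^(k)
        v0 = v0 / norm( v0 );        % v0^(k+1): normalize to have length one
    end
    lambda_0 = v0' * A * v0;         % Rayleigh quotient estimate of lambda_0
end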

Homework 10.1.1.1.

You may want to start by executing git pull to update your directory Assignments.

Examine Assignments/Week10/matlab/PowerMethod.m, which implements

[ lambda_0, v0 ] = PowerMethod( A, x, maxiters, illustrate, delay )

This routine implements the Power Method, starting with a vector x, for a maximum of maxiters iterations or until convergence, whichever comes first. To test it, execute the script Assignments/Week10/matlab/test_SubspaceIteration.m, which uses the Power Method to compute the largest eigenvalue (in magnitude) and corresponding eigenvector of an \(m \times m \) Hermitian matrix \(A \) with eigenvalues \(1, \ldots, m \text{.}\)

Be sure to click on "Figure 1" to see the graph that is created.

Solution

Watch the video regarding this problem on YouTube: https://youtu.be/8Bgf1tJeMmg. (embedding a video in a solution seems to cause PreTeXt trouble...)

Recall that when we analyzed the convergence of the Power Method, we commented on the fact that the method converges to an eigenvector associated with the largest eigenvalue (in magnitude) if two conditions are met:

  • \(\vert \lambda_0 \vert \gt \vert \lambda_1 \vert \text{.}\)

  • \(v_0^{(0)} \) has a component in the direction of the eigenvector, \(x_0 \text{,}\) associated with \(\lambda_0 \text{.}\)

A second initial vector, \(v_1^{(0)} \text{,}\) does not have a component in the direction of \(x_0 \) if it is orthogonal to \(x_0 \text{.}\) So, if we know \(x_0 \text{,}\) we can pick a random vector, subtract out its component in the direction of \(x_0 \text{,}\) and use the result as \(v_1^{(0)} \text{.}\) If we start the Power Method with this vector (and roundoff error does not reintroduce a component in the direction of \(x_0 \)), then the iteration will home in on an eigenvector, \(x_1 \text{,}\) associated with the eigenvalue of second largest magnitude, \(\lambda_1 \) (provided \(A \) is Hermitian, \(\vert \lambda_1 \vert \gt \vert \lambda_2 \vert \text{,}\) and \(v_1^{(0)} \) has a component in the direction of \(x_1 \)). This iteration would look like

\begin{equation*} \begin{array}{lll} x_0 := x_0 / \| x_0 \|_2 \amp \mbox{ } \amp \mbox{ normalize known eigenvector } x_0 \mbox{ to have length one } \\ v_1 := \mbox{ random vector} \\ v_1 := v_1 - x_0^Hv_1 x_0 \amp \amp \mbox{ make sure the vector is orthogonal to } x_0 \\ v_1^{(0)} := v_1 / \| v_1 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf for~} k:=0, \ldots \\ ~~~ v_1 := A v_1^{(k)} \\ ~~~ v_1^{(k+1)} := v_1 / \| v_1 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf endfor} \end{array} \end{equation*}
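
A minimal MATLAB sketch of this idea, assuming x0 holds a known eigenvector associated with \(\lambda_0 \text{,}\) might look as follows. The function name is hypothetical, and the routine you will write in the next homework also takes illustrate and delay parameters.

% Sketch: Power Method applied to a starting vector from which the
% component in the direction of x0 has been removed.
function [ lambda_1, v1 ] = PowerMethodLambda1Sketch( A, x, x0, maxiters )
    x0 = x0 / norm( x0 );            % normalize known eigenvector x0
    v1 = x - ( x0' * x ) * x0;       % make sure the vector is orthogonal to x0
    v1 = v1 / norm( v1 );            % v1^(0): normalize to have length one
    for k = 1:maxiters
        v1 = A * v1;                 % v1 := A v1^(k)
        v1 = v1 / norm( v1 );        % v1^(k+1): normalize to have length one
    end
    lambda_1 = v1' * A * v1;         % Rayleigh quotient estimate of lambda_1
end
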
Homework 10.1.1.2.

Copy Assignments/Week10/matlab/PowerMethod.m into PowerMethodLambda1.m. Modify it by adding an input parameter x0, which is an eigenvector associated with \(\lambda_0 \) (the eigenvalue with largest magnitude).

[ lambda_1, v1 ] = PowerMethodLambda1( A, x, x0, maxiters, illustrate, delay )

The new function should subtract out the component of the initial random vector in the direction of x0, as in the above algorithm.

Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, changing (0) to (1), and use it to examine the convergence of the method.

What do you observe?

Solution

Watch the video regarding this problem on YouTube: https://youtu.be/48HnBJmQhX8. (embedding a video in a solution seems to cause PreTeXt trouble...)

Because we should be concerned about the introduction of a component in the direction of \(x_0 \) due to roundoff error, we may want to reorthogonalize with respect to \(x_0 \) in each iteration:

\begin{equation*} \begin{array}{lll} x_0 := x_0 / \| x_0 \|_2 \amp \mbox{ } \amp \mbox{ normalize known eigenvector } x_0 \mbox{ to have length one } \\ v_1 := \mbox{ random vector} \\ v_1 := v_1 - x_0^Hv_1 x_0 \amp \amp \mbox{ make sure the vector is orthogonal to } x_0 \\ v_1^{(0)} := v_1 / \| v_1 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf for~} k:=0, \ldots \\ ~~~ v_1 := A v_1^{(k)} \\ ~~~ v_1 := v_1 - x_0^Hv_1 x_0 \amp \amp \mbox{ make sure the vector is orthogonal to } x_0 \\ ~~~ v_1^{(k+1)} := v_1 / \| v_1 \|_2 \amp \amp \mbox{ normalize to have length one } \\ {\bf endfor} \end{array} \end{equation*}
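
In MATLAB, the only change relative to the previous sketch is one extra line inside the loop (again a sketch, not the routine the next homework asks you to write):

    for k = 1:maxiters
        v1 = A * v1;                 % v1 := A v1^(k)
        v1 = v1 - ( x0' * v1 ) * x0; % reorthogonalize: remove any component in the direction of x0
        v1 = v1 / norm( v1 );        % v1^(k+1): normalize to have length one
    end
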
Homework 10.1.1.3.

Copy PowerMethodLambda1.m into PowerMethodLambda1Reorth.m and modify it to reorthogonalize with respect to x0:

[ lambda_1, v1 ] = PowerMethodLambda1Reorth( A, x, x0, maxiters, illustrate, delay );

Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, changing (0) to (1), and use it to examine the convergence of the method.

What do you observe?

Solution

Watch the video regarding this problem on YouTube: https://youtu.be/YmZc2oq02kA. (embedding a video in a solution seems to cause PreTeXt trouble...)

We now observe that normalizing \(x_0 \) to have unit length, subtracting out the component of \(v_1 \) in the direction of \(x_0 \text{,}\) and normalizing the result are exactly the steps performed by the Gram-Schmidt process. More generally, this is equivalent to computing the QR factorization of the matrix \(\left( \begin{array}{c|c} x_0 \amp v_1 \end{array} \right).\) This suggests the algorithm

\begin{equation*} \begin{array}{l} v_1 := \mbox{ random vector} \\ ( \left( \begin{array}{c | c} x_0 \amp v_1^{(0)} \end{array} \right), R ) := {\rm QR}( \left( \begin{array}{c | c} x_0 \amp v_1 \end{array} \right) ) \\ {\bf for~} k:=0, \ldots \\ ~~~( \left( \begin{array}{c | c} x_0 \amp v_1^{(k+1)} \end{array} \right), R ) := {\rm QR}( \left( \begin{array}{c | c} x_0 \amp A v_1^{(k)} \end{array} \right) ) \\ {\bf endfor} \end{array} \end{equation*}

Obviously, this redundantly normalizes \(x_0 \text{.}\) It puts us on the path of a practical algorithm for computing the eigenvectors associated with \(\lambda_0 \) and \(\lambda_1 \text{.}\)
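
This equivalence is easy to check in MATLAB with the built-in (economy-size) QR factorization; here we assume x0 and v1 are stored as column vectors of length m:

[ Q, R ] = qr( [ x0 v1 ], 0 );       % economy-size QR of the m x 2 matrix ( x0 | v1 )
% Q(:,1) is x0 normalized to unit length, and Q(:,2) is v1 with its x0
% component removed and then normalized, each possibly up to a sign.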

The problem is that we typically don't know \(x_0 \) up front. Rather than first using the power method to compute it, we can instead iterate with two random vectors, where the first converges to a vector associated with \(\lambda_0 \) and the second to one associated with \(\lambda_1 \text{:}\)

\begin{equation*} \begin{array}{l} v_0 := \mbox{ random vector} \\ v_1 := \mbox{ random vector} \\ ( \left( \begin{array}{c | c} v_0^{(0)} \amp v_1^{(0)} \end{array} \right), R ) := {\rm QR}( \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) ) \\ {\bf for~} k:=0, \ldots \\ ~~~( \left( \begin{array}{c | c} v_0^{(k+1)} \amp v_1^{(k+1)} \end{array} \right), R ) := {\rm QR}( A \left( \begin{array}{c | c} v_0^{(k)} \amp v_1^{(k)} \end{array} \right) ) \\ {\bf endfor} \end{array} \end{equation*}
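
A minimal MATLAB sketch of this iteration, assuming A is an \(m \times m \) Hermitian matrix and maxiters is given, uses the economy-size QR factorization in place of the explicit Gram-Schmidt steps:

m = size( A, 1 );
V = rand( m, 2 );                    % two random starting vectors as columns
[ V, R ] = qr( V, 0 );               % V(:,1) = v0^(0), V(:,2) = v1^(0)
for k = 1:maxiters
    [ V, R ] = qr( A * V, 0 );       % V(:,1) = v0^(k+1), V(:,2) = v1^(k+1)
end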

We observe:

  • If \(\vert \lambda_0 \vert \gt \vert \lambda_1 \vert \text{,}\) the vectors \(v_0^{(k)} \) will converge linearly to a vector in the direction of \(x_0 \) at a rate dictated by the ratio \(\vert \lambda_1 \vert / \vert \lambda_0 \vert \text{.}\)

  • If \(\vert \lambda_0 \vert \gt \vert \lambda_1 \vert \gt \vert \lambda_2 \vert \text{,}\) the vectors \(v_1^{(k)} \) will converge linearly to a vector in the direction of \(x_1 \) at a rate dictated by the ratio \(\vert \lambda_2 \vert / \vert \lambda_1 \vert \text{.}\)

  • If \(\vert \lambda_0 \vert \geq \vert \lambda_1 \vert \gt \vert \lambda_2 \vert\) then \(\Span( \{ v_0^{(k)}, v_1^{(k)} \} ) \) will eventually start approximating the subspace \(\Span( \{ x_0, x_1 \} ) \text{.}\)

What we have described is a special case of subspace iteration. The associated eigenvalues can be approximated via the Rayleigh quotients:

\begin{equation*} \lambda_0 \approx \lambda_0^{(k)} = {v_0^{(k)}}^H A v_0^{(k)} \mbox{ and } \lambda_1 \approx \lambda_1^{(k)} = {v_1^{(k)}}^H A v_1^{(k)} \end{equation*}

Alternatively,

\begin{equation*} A^{(k)} = \left( \begin{array}{c| c} v_0^{(k)} \amp v_1^{(k)} \end{array} \right)^H A \left( \begin{array}{c| c} v_0^{(k)} \amp v_1^{(k)} \end{array} \right) \mbox{ converges to } \left( \begin{array}{c| c} \lambda_0 \amp 0 \\ \hline 0 \amp \lambda_1 \end{array} \right) \end{equation*}

if \(A \) is Hermitian, \(\vert \lambda_1 \vert \gt \vert \lambda_2 \vert \text{,}\) and \(v_0^{(0)} \) and \(v_1^{(0)} \) have components in the directions of \(x_0 \) and \(x_1 \text{,}\) respectively.
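
In MATLAB, with the columns of V holding the current iterates \(v_0^{(k)} \) and \(v_1^{(k)} \) as in the sketch above, these estimates can be computed as

lambda_0_k = V(:,1)' * A * V(:,1);   % Rayleigh quotient approximating lambda_0
lambda_1_k = V(:,2)' * A * V(:,2);   % Rayleigh quotient approximating lambda_1
Ak = V' * A * V;                     % A^(k), which converges to diag( lambda_0, lambda_1 )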

The natural extension of these observations is to iterate with \(n \) vectors:

\begin{equation*} \begin{array}{l} \widehat V := \mbox{ random } m \times n \mbox{ matrix} \\ ( \widehat V^{(0)}, R ) := {\rm QR}( \widehat V ) \\ A^{(0)} = {\widehat V^{(0)}~}^H A \widehat V^{(0)} \\ {\bf for~} k:=0, \ldots \\ ~~~( \widehat V^{(k+1)}, R ) := {\rm QR}( A \widehat V^{(k)} ) \\ ~~~ A^{(k+1)} = {\widehat V^{(k+1)}~}^H A \widehat V^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}
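
A minimal MATLAB sketch of this algorithm, in the spirit of (but not identical to) the SubspaceIteration routine that Homework 10.1.1.4 asks you to write, is

% Sketch: subspace iteration with the n columns of V; the illustrate and
% delay parameters of the course routine are omitted.
function [ Lambda, V ] = SubspaceIterationSketch( A, V, maxiters )
    [ V, R ] = qr( V, 0 );           % Vhat^(0): orthonormalize the columns of V
    Ak = V' * A * V;                 % A^(0)
    for k = 1:maxiters
        [ V, R ] = qr( A * V, 0 );   % Vhat^(k+1)
        Ak = V' * A * V;             % A^(k+1)
    end
    Lambda = diag( Ak );             % approximations to lambda_0, ..., lambda_{n-1}
end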

By extending the reasoning given so far in this unit, if

  • \(A \) is Hermitian,

  • \(\vert \lambda_0 \vert \gt \vert \lambda_1 \vert \gt \cdots \gt \vert \lambda_{n-1} \vert \gt \vert \lambda_{n} \vert \text{,}\) and

  • each \(v_j \) (the \(j\)th column of \(\widehat V \)) has a component in the direction of \(x_j \text{,}\) an eigenvector associated with \(\lambda_j \text{,}\)

then each \(v_j^{(k)} \) will converge to a vector in the direction of \(x_j \text{.}\) The rate with which the component in the direction of \(x_p \text{,}\) \(j \lt p \lt m \text{,}\) is removed from \(v_j^{(k)} \) is dictated by the ratio \(\vert \lambda_p \vert / \vert \lambda_j \vert \text{.}\)

If some of the eigenvalues have equal magnitude, then the corresponding columns of \(\widehat V^{(k)} \) will eventually form a basis for the subspace spanned by the eigenvectors associated with those eigenvalues.

Homework 10.1.1.4.

Copy PowerMethodLambda1Reorth.m into SubspaceIteration.m and modify it to work with an \(m \times n \) matrix \(V \text{:}\)

[ Lambda, V ] = SubspaceIteration( A, V, maxiters, illustrate, delay );

Modify the appropriate line in Assignments/Week10/matlab/test_SubspaceIteration.m, changing (0) to (1), and use it to examine the convergence of the method.

What do you observe?

Solution

Watch the video regarding this problem on YouTube: https://youtu.be/Er7jGYs0HbE. (embedding a video in a solution seems to cause PreTeXt trouble...)