Skip to main content

Subsection 9.3.3 The Rayleigh quotient Iteration

A basic idea that allows one to accelerate the convergence of the inverse iteration is captured by the following lemma:

Homework 9.3.3.1.

Prove Lemma 9.3.3.1.

Solution

If \(A x = \lambda x \) then \(( A - \mu I ) x = A x - \mu x = \lambda x - \mu x = ( \lambda - \mu ) x \text{.}\)

The matrix \(A - \mu I \) is referred to as the matrix \(A \) that has been "shifted" by \(\mu \text{.}\) What the next lemma captures is that shifting \(A \) by \(\mu\) shifts the spectrum of \(A \) by \(\mu \text{:}\)

Homework 9.3.3.2.

Prove Lemma 9.3.3.2.

Solution
\begin{equation*} A - \mu I = X \Lambda X^{-1} - \mu X X^{-1} = X ( \Lambda - \mu I ) X^{-1} . \end{equation*}

This suggests the following (naive) iteration: Pick a value \(\mu \) close to \(\lambda_{m-1} \text{.}\) Iterate

\begin{equation*} \begin{array}{l} {\bf for~} k = 0, \ldots\\ ~~~ v^{(k+1)} = ( A - \mu I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = ( \lambda_{m-1} - \mu ) v^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

Of course one would solve \(( A - \mu I ) v^{(k+1)} = v^{(k)} \) rather than computing and applying the inverse of \(A \text{.}\)

If we index the eigenvalues so that

\begin{equation*} \vert \lambda_0 - \mu \vert \leq \cdots \leq \vert \lambda_{m-2} - \mu \vert \lt \vert \lambda_{m-1} - \mu \vert \end{equation*}

then

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq \left\vert \frac{\lambda_{m-1}- \mu}{\lambda_{m-2} - \mu} \right\vert \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}. \end{array} \end{equation*}

The closer to \(\lambda_{m-1} \) the shift \(\mu \) is chosen, the more favorable the ratio (constant) that dictates the linear convergence of this modified Inverse Power Method.

A more practical algorithm is given by

\begin{equation*} \begin{array}{l} {\bf for~} k = 0, \ldots \\ ~~~ v^{(k+1)} = ( A - \mu I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = v^{(k+1)} / \| v^{(k+1)} \| \\ {\bf endfor} \end{array} \end{equation*}

where instead of multiplying by the inverse one would want to solve the linear system \(( A - \mu I ) v^{(k+1)} = v^{(k)}\) instead.

The question now becomes how to chose \(\mu \) so that it is a good guess for \(\lambda_{m-1} \text{.}\) Often an application inherently supplies a reasonable approximation for the smallest eigenvalue or an eigenvalue of particular interest. Alternatively, we know that eventually \(v^{(k)} \) becomes a good approximation for \(x_{m-1} \) and therefore the Rayleigh quotient gives us a way to find a good approximation for \(\lambda_{m-1} \text{.}\) This suggests the (naive) Rayleigh-quotient iteration:

\begin{equation*} \begin{array}{l} {\bf for~} k = 0, \ldots \\ ~~~ \mu_k = v^{(k)\,H} A v^{(k)} / (v^{(k)\,H} v^{(k)}) \\ ~~~ v^{(k+1)} = ( A - \mu_k I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = ( \lambda_{m-1} -\mu_k ) v^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

Here \(\lambda_{m-1} \) is the eigenvalue to which the method eventually converges.

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq \left\vert \frac{\lambda_{m-1}- \mu_k}{\lambda_{m-2} - \mu_k} \right\vert \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \end{array} \end{equation*}

with

\begin{equation*} \lim_{k\rightarrow \infty} (\lambda_{m-1}- \mu_k) = 0 \end{equation*}

which means superlinear convergence is observed. In fact, it can be shown that once \(k \) is large enough

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq C \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}^2, \end{array} \end{equation*}

thus achieving quadratic convergence. Roughly speaking this means that every iteration doubles the number of correct digits in the current approximation. To prove this, one shows that \(\vert \lambda_{m-1} - \mu_k \vert \leq C \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \text{.}\)

Better yet, it can be shown that if \(A \) is Hermitian, then, once \(k \) is large enough,

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq C \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}^3, \end{array} \end{equation*}

and hence the naive Rayleigh quotient iteration achieves cubic convergence for Hermitian matrices. Roughly speaking this means that every iteration triples the number of correct digits in the current approximation. This is mind-boggling fast convergence!

A practical Rayleigh quotient iteration is given by

\begin{equation*} \begin{array}{l} v^{(0)} = v^{(0)} / \| v^{(0)} \|_2 \\ {\bf for~} k = 0, \ldots \\ ~~~\mu_k = v^{(k)\,H} A v^{(k)} ~~~~~~ \mbox{(Now } \| v^{(k)} \|_2 = 1 \mbox{)} \\ ~~~ v^{(k+1)} = ( A - \mu_k I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = v^{(k+1)} / \| v^{(k+1)} \| \\ {\bf endfor} \end{array} \end{equation*}