## Unit9.3.4The Rayleigh Quotient Iteration

A basic idea that allows one to accelerate the convergence of the inverse iteration is captured by the following exercise:

###### Homework9.3.4.1.

Let $A \in \mathbb C^{m \times m} \text{,}$ $\rho \in \mathbb C \text{,}$ and $( \lambda, x )$ an eigenpair of $A$ .

Which of the follow is an eigenpair of the shifted matrix $A - \rho I \text{:}$

• $( \lambda, x ) \text{.}$

• $( \lambda, A^{-1} x ) \text{.}$

• $( \lambda - \rho, x ) \text{.}$

• $( 1/(\lambda - \rho), x ) \text{.}$

$( \lambda - \rho, x ) \text{.}$

Solution

Let $A x = \lambda x \text{.}$ Then

\begin{equation*} ( A - \rho I ) x = A x - \rho x = \lambda x - \rho x = ( \lambda - \rho ) x. \end{equation*}

We conclude that $( \lambda- \rho, x )$ is an eigenpair of $A-\rho I \text{.}$

The matrix $A - \rho I$ is referred to as the matrix $A$ that has been "shifted" by $\rho \text{.}$ What the next lemma captures is that shifting $A$ by $\rho$ shifts the spectrum of $A$ by $\rho \text{:}$

###### Homework9.3.4.2.

Prove Lemma 9.3.4.1.

Solution
\begin{equation*} A - \rho I = X \Lambda X^{-1} - \rho X X^{-1} = X ( \Lambda - \rho I ) X^{-1} . \end{equation*}

This suggests the following (naive) iteration: Pick a value $\rho$ close to $\lambda_{m-1} \text{.}$ Iterate

\begin{equation*} \begin{array}{l} \mbox{Pick } v^{(0)} \mbox{ of unit length} \\ {\bf for~} k = 0, \ldots\\ ~~~ v^{(k+1)} = ( A - \rho I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = ( \lambda_{m-1} - \rho ) v^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

Of course one would solve $( A - \rho I ) v^{(k+1)} = v^{(k)}$ rather than computing and applying the inverse of $A \text{.}$

If we index the eigenvalues so that

\begin{equation*} \vert \lambda_{m-1} - \rho \vert \lt \vert \lambda_{m-2} - \rho \vert \leq \cdots \leq \vert \lambda_{0} - \rho \vert \end{equation*}

then

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq \left\vert \frac{\lambda_{m-1}- \rho}{\lambda_{m-2} - \rho} \right\vert \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}. \end{array} \end{equation*}

The closer to $\lambda_{m-1}$ the shift $\rho$ is chosen, the more favorable the ratio (constant) that dictates the linear convergence of this modified Inverse Power Method.

A more practical algorithm is given by

\begin{equation*} \begin{array}{l} \mbox{Pick } v^{(0)} \mbox{ of unit length} \\ {\bf for~} k = 0, \ldots \\ ~~~ v^{(k+1)} = ( A - \rho I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = v^{(k+1)} / \| v^{(k+1)} \| \\ {\bf endfor} \end{array} \end{equation*}

where instead of multiplying by the inverse one would want to solve the linear system $( A - \rho I ) v^{(k+1)} = v^{(k)}$ instead.

The question now becomes how to chose $\rho$ so that it is a good guess for $\lambda_{m-1} \text{.}$ Often an application inherently supplies a reasonable approximation for the smallest eigenvalue or an eigenvalue of particular interest. Alternatively, we know that eventually $v^{(k)}$ becomes a good approximation for $x_{m-1}$ and therefore the Rayleigh quotient gives us a way to find a good approximation for $\lambda_{m-1} \text{.}$ This suggests the (naive) Rayleigh-quotient iteration:

\begin{equation*} \begin{array}{l} \mbox{Pick } v^{(0)} \mbox{ of unit length} \\ {\bf for~} k = 0, \ldots \\ ~~~ \rho_k = v^{(k)\,H} A v^{(k)} / (v^{(k)\,H} v^{(k)}) \\ ~~~ v^{(k+1)} = ( A - \rho_k I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = ( \lambda_{m-1} -\rho_k ) v^{(k+1)} \\ {\bf endfor} \end{array} \end{equation*}

Here $\lambda_{m-1}$ is the eigenvalue to which the method eventually converges.

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq \left\vert \frac{\lambda_{m-1}- \rho_k}{\lambda_{m-2} - \rho_k} \right\vert \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \end{array} \end{equation*}

with

\begin{equation*} \lim_{k\rightarrow \infty} (\lambda_{m-1}- \rho_k) = 0 \end{equation*}

which means superlinear convergence is observed. In fact, it can be shown that once $k$ is large enough

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \|_{X^{-1}} \leq C \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}^2, \end{array} \end{equation*}

thus achieving quadratic convergence. Roughly speaking this means that every iteration doubles the number of correct digits in the current approximation. To prove this, one shows that $\vert \lambda_{m-1} - \rho_k \vert \leq K \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}$ for some constant $K \text{.}$ Details go beyond this discussion.

Better yet, it can be shown that if $A$ is Hermitian, then (once $k$ is large enough)

\begin{equation*} \begin{array}{l} \| v^{(k+1)} - \psi_{m-1} x_{m-1} \| \leq C \| v^{(k)} - \psi_{m-1} x_{m-1} \|^3 \end{array} \end{equation*}

for some constant $C$ and hence the naive Rayleigh Quotient Iteration achieves cubic convergence for Hermitian matrices. Here our norm $\| \cdot \|_{X^{-1}}$ becomes any p-norm since the Spectral Decomposition Theorem tells us that for Hermitian matrices $X$ can be taken to equal a unitary matrix. Roughly speaking this means that every iteration triples the number of correct digits in the current approximation. This is mind-boggling fast convergence!

A practical Rayleigh Quotient Iteration is given by

\begin{equation*} \begin{array}{l} v^{(0)} = v^{(0)} / \| v^{(0)} \|_2 \\ {\bf for~} k = 0, \ldots \\ ~~~\rho_k = v^{(k)\,H} A v^{(k)} ~~~~~~ \mbox{(Now } \| v^{(k)} \|_2 = 1 \mbox{)} \\ ~~~ v^{(k+1)} = ( A - \rho_k I )^{-1} v^{(k)} \\ ~~~ v^{(k+1)} = v^{(k+1)} / \| v^{(k+1)} \| \\ {\bf endfor} \end{array} \end{equation*}
###### Remark9.3.4.2.

A concern with the (Shifted) Inverse Power Method and Rayleigh Quotient Iteration is that the matrix with which one solves is likely nearly singular. It turns out that this actually helps: the error that is amplified most is in the direction of the eigenvector associated with the smallest eigenvalue (after shifting, if appropriate).