## Unit 9.3.2 The Power Method: Convergence

Before we make the algorithm practical, let us examine how fast the iteration converges. This requires a few definitions regarding rates of convergence.

###### Definition 9.3.2.1. Convergence of a sequence of scalars.

Let $\alpha_0, \alpha_1 , \alpha_2, \ldots \in \R$ be an infinite sequence of scalars. Then $\alpha_k$ is said to converge to $\alpha$ if

\begin{equation*} \lim_{k \rightarrow \infty} \vert \alpha_k - \alpha \vert = 0 . \end{equation*}
###### Definition 9.3.2.2. Convergence of a sequence of vectors.

Let $x_0, x_1 , x_2, \ldots \in \C^m$ be an infinite sequence of vectors. Then $x_k$ is said to converge to $x$ if for any norm $\| \cdot \|$

\begin{equation*} \lim_{k \rightarrow \infty} \| x_k - x \| = 0. \end{equation*}

Because of the equivalence of norms, discussed in Unit 1.2.6, if a sequence of vectors converges in one norm, then it converges in all norms.

###### Definition 9.3.2.3. Rate of convergence.

Let $\alpha_0, \alpha_1 , \alpha_2, \ldots \in \R$ be an infinite sequence of scalars that converges to $\alpha \text{.}$ Then

• $\alpha_k$ is said to converge linearly to $\alpha$ if for sufficiently large $k$

\begin{equation*} \vert \alpha_{k+1} - \alpha \vert \leq C \vert \alpha_{k} - \alpha \vert \end{equation*}

for some constant $C \lt 1 \text{.}$ In other words, if

\begin{equation*} \lim_{k \rightarrow \infty} \frac {\vert \alpha_{k+1} - \alpha \vert} {\vert \alpha_{k} - \alpha \vert} = C \lt 1. \end{equation*}
• $\alpha_k$ is said to converge superlinearly to $\alpha$ if for sufficiently large $k$

\begin{equation*} \vert \alpha_{k+1} - \alpha \vert \leq C_k \vert \alpha_{k} - \alpha \vert \end{equation*}

with $C_k \rightarrow 0 \text{.}$ In other words, if

\begin{equation*} \lim_{k \rightarrow \infty} \frac {\vert \alpha_{k+1} - \alpha \vert} {\vert \alpha_{k} - \alpha \vert} = 0. \end{equation*}
• $\alpha_k$ is said to converge quadratically to $\alpha$ if for sufficiently large $k$

\begin{equation*} \vert \alpha_{k+1} - \alpha \vert \leq C \vert \alpha_{k} - \alpha \vert ^2 \end{equation*}

for some constant $C \text{.}$ In other words, if

\begin{equation*} \lim_{k \rightarrow \infty} \frac {\vert \alpha_{k+1} - \alpha \vert} {\vert \alpha_{k} - \alpha \vert^2} = C . \end{equation*}
• $\alpha_k$ is said to converge cubically to $\alpha$ if for large enough $k$

\begin{equation*} \vert \alpha_{k+1} - \alpha \vert \leq C \vert \alpha_{k} - \alpha \vert ^3 \end{equation*}

for some constant $C \text{.}$ In other words, if

\begin{equation*} \lim_{k \rightarrow \infty} \frac {\vert \alpha_{k+1} - \alpha \vert} {\vert \alpha_{k} - \alpha \vert^3} = C . \end{equation*}
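These rates are easy to observe numerically. As an illustration (the specific sequences are assumptions of this sketch, not from the text), $\alpha_k = 2^{-k} \text{,}$ $\alpha_k = 2^{-2^k} \text{,}$ and $\alpha_k = 2^{-3^k}$ all converge to $\alpha = 0 \text{,}$ at linear, quadratic, and cubic rates, respectively. A short Python check of the defining ratios:

```python
# Illustrative sequences (an assumption of this sketch, not from the text):
# alpha_k = 2^(-k), 2^(-2^k), and 2^(-3^k) converge to alpha = 0 at linear,
# quadratic, and cubic rates, respectively.

linear = [2.0 ** (-k) for k in range(1, 8)]
quadratic = [2.0 ** (-2 ** k) for k in range(1, 6)]
cubic = [2.0 ** (-3 ** k) for k in range(1, 5)]

def ratios(seq, p):
    """Compute |alpha_{k+1} - alpha| / |alpha_k - alpha|^p with alpha = 0."""
    return [seq[k + 1] / seq[k] ** p for k in range(len(seq) - 1)]

print(ratios(linear, 1))     # every ratio is 0.5, so C = 1/2 < 1
print(ratios(quadratic, 2))  # every ratio is 1.0, so C = 1
print(ratios(cubic, 3))      # every ratio is 1.0, so C = 1
```

Since all quantities involved are powers of two, the ratios come out exactly in floating-point arithmetic.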

Linear convergence can be slow. Let's say that for $k \geq K$ we observe that

\begin{equation*} \vert \alpha_{k+1} - \alpha \vert \leq C \vert \alpha_{k} - \alpha \vert . \end{equation*}

Then, clearly, $\vert \alpha_{k+n} - \alpha\vert \leq C^n \vert \alpha_{k} - \alpha\vert \text{.}$ If $C = 0.99 \text{,}$ progress may be very, very slow. If $\vert \alpha_k - \alpha \vert = 1 \text{,}$ then

\begin{equation*} \begin{array}{rcl} \vert \alpha_{k+1} - \alpha \vert \amp\leq\amp 0.99000 \\ \vert \alpha_{k+2} - \alpha \vert \amp\leq\amp 0.98010 \\ \vert \alpha_{k+3} - \alpha \vert \amp\leq\amp 0.97030 \\ \vert \alpha_{k+4} - \alpha \vert \amp\leq\amp 0.96060 \\ \vert \alpha_{k+5} - \alpha \vert \amp\leq\amp 0.95099 \\ \vert \alpha_{k+6} - \alpha \vert \amp\leq\amp 0.94148 \\ \vert \alpha_{k+7} - \alpha \vert \amp\leq\amp 0.93206 \\ \vert \alpha_{k+8} - \alpha \vert \amp\leq\amp 0.92274 \\ \vert \alpha_{k+9} - \alpha \vert \amp\leq\amp 0.91351 \end{array} \end{equation*}

Quadratic convergence is much faster. Suppose that for $k \geq K$ we observe that $\vert \alpha_{k+1} - \alpha \vert \leq C \vert \alpha_{k} - \alpha \vert^2 \text{.}$ Then

\begin{equation*} \begin{array}{rcl} \vert \alpha_{k+1} - \alpha\vert \amp\leq\amp C \vert \alpha_{k} - \alpha\vert ^2 \\ \vert \alpha_{k+2} - \alpha\vert \amp\leq\amp C \vert \alpha_{k+1} - \alpha\vert ^2 \leq C ( C \vert \alpha_{k} - \alpha\vert ^2 )^2 = C^3 \vert \alpha_{k} - \alpha\vert ^4 \\ \vert \alpha_{k+3} - \alpha\vert \amp\leq\amp C \vert \alpha_{k+2} - \alpha\vert ^2 \leq C ( C^3 \vert \alpha_{k} - \alpha\vert ^4 )^2 = C^7 \vert \alpha_{k} - \alpha\vert ^8 \\ \amp \vdots \amp \\ \vert \alpha_{k+n} - \alpha\vert \amp\leq\amp C^{2^n-1} \vert \alpha_{k} - \alpha\vert ^{2^n} \end{array} \end{equation*}

Even if $C = 0.99$ and $\vert \alpha_k - \alpha \vert = 1 \text{,}$ the error now decreases rapidly:

\begin{equation*} \begin{array}{rcl} \vert \alpha_{k+1} - \alpha \vert \amp\leq\amp 0.99000 \\ \vert \alpha_{k+2} - \alpha \vert \amp\leq\amp 0.970299 \\ \vert \alpha_{k+3} - \alpha \vert \amp\leq\amp 0.932065 \\ \vert \alpha_{k+4} - \alpha \vert \amp\leq\amp 0.860058 \\ \vert \alpha_{k+5} - \alpha \vert \amp\leq\amp 0.732303 \\ \vert \alpha_{k+6} - \alpha \vert \amp\leq\amp 0.530905 \\ \vert \alpha_{k+7} - \alpha \vert \amp\leq\amp 0.279042 \\ \vert \alpha_{k+8} - \alpha \vert \amp\leq\amp 0.077085 \\ \vert \alpha_{k+9} - \alpha \vert \amp\leq\amp 0.005882 \\ \vert \alpha_{k+10} - \alpha \vert \amp\leq\amp 0.000034 \end{array} \end{equation*}
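Both tables can be reproduced with a few lines of Python by iterating the two recurrences with $C = 0.99$ and $\vert \alpha_k - \alpha \vert = 1 \text{:}$

```python
# Reproduce both tables: C = 0.99 and |alpha_k - alpha| = 1 to start.
C = 0.99
err_lin = 1.0   # bound under linear convergence
err_quad = 1.0  # bound under quadratic convergence
for n in range(1, 11):
    err_lin = C * err_lin          # |alpha_{k+n} - alpha| <= C^n
    err_quad = C * err_quad ** 2   # |alpha_{k+n} - alpha| <= C^(2^n - 1)
    print(f"n = {n:2d}:  linear <= {err_lin:.5f}   quadratic <= {err_quad:.6f}")
```

After ten iterations the linear bound has only dropped to about $0.904 \text{,}$ while the quadratic bound is about $3.4 \times 10^{-5} \text{.}$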

If we consider $\alpha$ the correct result, then eventually the number of correct digits roughly doubles in each iteration. This can be explained as follows: if $\vert \alpha_{k} - \alpha \vert \lt 1 \text{,}$ then the number of correct decimal digits is given by

\begin{equation*} - \log_{10} \vert \alpha_{k} - \alpha \vert . \end{equation*}

Since $\log_{10}$ is a monotonically increasing function, and since $\log_{10}( C ) \leq 0$ when $C \leq 1$ (as in our example),

\begin{equation*} \begin{array}{l} \log_{10} \vert \alpha_{k+1} - \alpha \vert \\ ~~~\leq ~~~~\\ \log_{10} C \vert \alpha_{k} - \alpha \vert^2 \\ ~~~ = ~~~~ \\ \log_{10}(C) + 2\log_{10} \vert \alpha_{k} - \alpha \vert \\ ~~~ \leq ~~~~\\ 2\log_{10}\vert \alpha_{k} - \alpha \vert \end{array} \end{equation*}

and hence

\begin{equation*} \begin{array}[t]{c} \underbrace{ - \log_{10} \vert \alpha_{k+1} - \alpha \vert } \\ \mbox{number of correct} \\ \mbox{digits in }\alpha_{k+1} \end{array} \geq 2( \begin{array}[t]{c} \underbrace{ - \log_{10}\vert \alpha_{k} - \alpha \vert } \\ \mbox{number of correct} \\ \mbox{digits in }\alpha_{k} \end{array} ). \end{equation*}

Cubic convergence is dizzyingly fast: Eventually the number of correct digits triples from one iteration to the next.
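The doubling of correct digits can be seen concretely. In this sketch (a simplifying assumption, with $C = 1$) we start with one correct digit and square the error each iteration:

```python
import math

# Track the number of correct decimal digits, -log10 of the error, under
# quadratic convergence with C = 1 (a simplifying assumption for illustration).
err = 0.1  # one correct digit to start
for k in range(5):
    digits = -math.log10(err)
    print(f"iteration {k}: error = {err:.1e}, about {digits:.0f} correct digits")
    err = err ** 2
# The printed digit counts are 1, 2, 4, 8, 16: doubling each iteration.
```

Replacing `err ** 2` with `err ** 3` would show the cubic case, where the digit count triples instead.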

For our analysis of the convergence of the Power Method, we define a convenient norm.

###### Homework 9.3.2.1.

Let $X \in \C^{m \times m}$ be nonsingular. Define $\| \cdot \|_{X^{-1}} : \C^m \rightarrow \R$ by $\| y \|_{X^{-1}} = \| X^{-1} y \|$ for some given norm $\| \cdot \|: \C^m \rightarrow \R \text{.}$

ALWAYS/SOMETIMES/NEVER: $\| \cdot \|_{X^{-1}}$ is a norm.

Solution

We need to show that

• If $y \neq 0$ then $\| y \|_{X^{-1}} \gt 0 \text{:}$

Let $y \neq 0$ and $z = X^{-1} y \text{.}$ Then $z \neq 0$ since $X$ is nonsingular. Hence

\begin{equation*} \| y \|_{X^{-1}} = \| X^{-1} y \| = \| z \| \gt 0 . \end{equation*}
• If $\alpha \in \C$ and $y \in \C^m$ then $\| \alpha y \|_{X^{-1}} = \vert \alpha \vert \| y \|_{X^{-1}} \text{:}$

\begin{equation*} \| \alpha y \|_{X^{-1}} = \| {X^{-1}} ( \alpha y ) \| = \| \alpha {X^{-1}} y \| = \vert \alpha \vert \| {X^{-1}} y \| = \vert \alpha \vert \| y \|_{X^{-1}} . \end{equation*}
• If $x, y \in \C^m$ then $\| x + y \|_{X^{-1}} \leq \| x \|_{X^{-1}} + \| y \|_{X^{-1}} \text{:}$

\begin{equation*} \begin{array}{l} \| x + y \|_{X^{-1}}\\ ~~~=~~~~\\ \| {X^{-1}} ( x + y ) \| \\ ~~~=~~~~\\ \| {X^{-1}} x + X^{-1} y \| \\ ~~~\leq~~~~\\ \| {X^{-1}} x \| + \| {X^{-1}} y \| \\ ~~~=~~~~\\ \| x \|_{X^{-1}} + \| y \|_{X^{-1}} . \end{array} \end{equation*}

What do we learn from this exercise? Recall that a vector $z$ can alternatively be written as $X ( X^{-1} z )$ so that the vector $\hat z = X^{-1} z$ tells you how to represent the vector $z$ in the basis given by the columns of $X \text{.}$ What the exercise tells us is that if we measure a vector by applying a known norm in a new basis, then that is also a norm.
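The three norm properties can also be spot-checked numerically. The sketch below (an illustration, not a proof; the choice of $X$ and of the 2-norm are assumptions) measures vectors in the basis given by the columns of a randomly chosen nonsingular `X`:

```python
import numpy as np

# A numerical spot check (not a proof): pick a nonsingular X and verify the
# three norm properties for || y ||_{X^-1} = || X^-1 y ||_2.
rng = np.random.default_rng(0)
m = 4
X = rng.standard_normal((m, m)) + m * np.eye(m)  # shifted so X is safely nonsingular
Xinv = np.linalg.inv(X)

def norm_Xinv(y):
    """|| y ||_{X^-1}: measure y in the basis given by the columns of X."""
    return np.linalg.norm(Xinv @ y)

x = rng.standard_normal(m)
y = rng.standard_normal(m)
alpha = -2.5

assert norm_Xinv(y) > 0                                             # definiteness
assert np.isclose(norm_Xinv(alpha * y), abs(alpha) * norm_Xinv(y))  # homogeneity
assert norm_Xinv(x + y) <= norm_Xinv(x) + norm_Xinv(y) + 1e-12      # triangle inequality
print("all three norm properties hold for this sample")
```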

With this insight, we can perform our convergence analysis, where, as in the last unit, $y = X^{-1} v^{(0)}$ and $\psi_0$ denotes the first element of $y \text{:}$

\begin{equation*} \begin{array}{l} v^{(k)} - \psi_0 x_0 \\ ~~~=~~~~\\ A^{k} v^{(0)} / \lambda_0^{k} - \psi_0 x_0 \\ ~~~=~~~~\\ X \left( \begin{array}{ c | c | c | c } 1 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \left( \frac{\lambda_1}{\lambda_0} \right)^{k} \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \left( \frac{\lambda_{m-1}}{\lambda_0} \right)^k \end{array} \right) X^{-1} v^{(0)} - \psi_0 x_0 \\ ~~~=~~~~\\ X \left( \begin{array}{ c | c | c | c } 1 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \left( \frac{\lambda_1}{\lambda_0} \right)^k \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \left( \frac{\lambda_{m-1}}{\lambda_0} \right)^k \end{array} \right) y - \psi_0 x_0 \\ ~~~=~~~~\\ X \left( \begin{array}{ c | c | c | c } 0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \left( \frac{\lambda_1}{\lambda_0} \right)^k \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \left( \frac{\lambda_{m-1}}{\lambda_0} \right)^k \end{array} \right) y \end{array} \end{equation*}

Hence

\begin{equation*} X^{-1} ( v^{(k)} - \psi_0 x_0 )= \left( \begin{array}{ c | c | c | c } 0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \left( \frac{\lambda_1}{\lambda_0} \right)^k \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \left( \frac{\lambda_{m-1}}{\lambda_0} \right)^k \end{array} \right) y \end{equation*}

and

\begin{equation*} X^{-1} (v^{(k+1)} - \psi_0 x_0 ) = \left( \begin{array}{ c | c | c | c } 0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \frac{\lambda_1}{\lambda_0} \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \frac{\lambda_{m-1}}{\lambda_0} \end{array} \right) X^{-1} ( v^{(k)} - \psi_0 x_0 ). \end{equation*}

Now, let $\| \cdot \|$ be a $p$-norm and let $\| \cdot \|_{X^{-1}}$ be defined as in Homework 9.3.2.1. Then

\begin{equation*} \begin{array}{rcl} \| v^{(k+1)} - \psi_0 x_0 \|_{X^{-1}} \amp=\amp \| X^{-1} ( v^{(k+1)} - \psi_0 x_0 )\| \\ \amp=\amp \left\| \left( \begin{array}{ c | c | c | c } 0 \amp 0 \amp \cdots \amp 0 \\ \hline 0 \amp \frac{\lambda_1}{\lambda_0} \amp \cdots \amp 0 \\ \hline \vdots \amp \vdots \amp \ddots \amp \vdots \\ \hline 0 \amp 0 \amp \cdots \amp \frac{\lambda_{m-1}}{\lambda_0} \end{array} \right) X^{-1} ( v^{(k)} - \psi_0 x_0 ) \right\| \\ \amp\leq\amp \left\vert \frac{\lambda_1}{\lambda_0} \right\vert \| X^{-1} ( v^{(k)} - \psi_0 x_0 ) \| = \left\vert \frac{\lambda_1}{\lambda_0} \right\vert \| v^{(k)} - \psi_0 x_0 \|_{X^{-1}}. \end{array} \end{equation*}

This shows that, in this norm, the convergence of $v^{(k)}$ to $\psi_0 x_0$ is linear: the difference between the current approximation, $v^{(k)} \text{,}$ and the eventual vector in the direction of the desired eigenvector, $\psi_0 x_0 \text{,}$ is reduced by at least a constant factor in each iteration.
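The bound can be observed directly. In the sketch below (an illustration, not from the text), $A$ is diagonal, so $X = I \text{,}$ the eigenvectors are the standard basis vectors, and $\psi_0 x_0$ is simply $e_0 \text{;}$ the error ratio settles at $\vert \lambda_1 / \lambda_0 \vert = 0.5 \text{:}$

```python
import numpy as np

# A sketch (not from the text): run v^(k+1) = A v^(k) / lambda_0 on a diagonal
# matrix, so X = I and the error v^(k) - psi_0 x_0 can be measured directly.
lambdas = np.array([2.0, 1.0, 0.5])  # |lambda_0| > |lambda_1| >= |lambda_2|
A = np.diag(lambdas)
x0 = np.array([1.0, 0.0, 0.0])       # eigenvector associated with lambda_0

v = np.array([1.0, 1.0, 1.0])        # v^(0), with psi_0 = 1
errors = []
for k in range(20):
    v = A @ v / lambdas[0]           # one scaled power iteration step
    errors.append(np.linalg.norm(v - x0))

ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
print(ratios[-1])                    # close to |lambda_1 / lambda_0| = 0.5
```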

Now, what if

\begin{equation*} \vert \lambda_0 \vert = \cdots = \vert \lambda_{n-1} \vert \gt \vert \lambda_n \vert \geq \cdots \geq \vert \lambda_{m-1} \vert ? \end{equation*}

By extending the above analysis one can easily show that $v^{(k)}$ will converge to a vector in the subspace spanned by the eigenvectors associated with $\lambda_0, \ldots , \lambda_{n-1} \text{.}$

An important special case is when $n = 2\text{:}$ if $A$ is real-valued then $\lambda_0$ may be complex-valued in which case its conjugate, $\bar \lambda_0 \text{,}$ is also an eigenvalue and hence has the same magnitude as $\lambda_0 \text{.}$ We deduce that $v^{(k)}$ will always be in the space spanned by the eigenvectors corresponding to $\lambda_0$ and $\bar \lambda_0 \text{.}$
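This behavior is easy to observe. In the following hypothetical illustration, the dominant eigenvalues of the real matrix $A$ are the complex conjugate pair $\cos \theta \pm i \sin \theta$ (both of magnitude one), with a third, smaller eigenvalue $0.5 \text{;}$ the iterate settles into the dominant two-dimensional subspace but its direction within that subspace keeps rotating:

```python
import numpy as np

# Hypothetical example: a real matrix whose dominant eigenvalues are the
# complex conjugate pair cos(theta) +/- i sin(theta), with third eigenvalue 0.5.
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           0.5]])

v = np.array([1.0, 0.0, 1.0])
for k in range(50):
    v = A @ v
    v = v / np.linalg.norm(v)        # normalize each step

# The component outside the dominant two-dimensional subspace dies out ...
print(abs(v[2]))                     # essentially zero
# ... but within that subspace the direction rotates by theta per step,
# so v^(k) never settles on a single eigenvector direction.
print(v[:2])
```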