## Subsection 10.4.5 Jacobi algorithm for the symmetric eigenvalue problem

The oldest algorithm for computing the eigenvalues and eigenvectors of a matrix is due to Jacobi and dates back to 1846:

• [16] C. G. J. Jacobi, Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen vorkommenden Gleichungen numerisch aufzulösen, Crelle's Journal 30, 51-94 (1846).

This is a method that keeps resurfacing, since it parallelizes easily. The operation count tends to be higher (by a constant factor) than that of reduction to tridiagonal form followed by the tridiagonal QR algorithm.

The idea is as follows: Given a symmetric $2 \times 2$ matrix

\begin{equation*} A_{31} = \left( \begin{array}{c c} \alpha_{11} \amp \alpha_{31} \\ \alpha_{31} \amp \alpha_{33} \end{array} \right) \end{equation*}

There exists a rotation (which is unitary)

\begin{equation*} J_{31} = \left( \begin{array}{c c} \gamma_{11} \amp - \sigma_{31} \\ \sigma_{31} \amp \gamma_{33} \end{array} \right) \end{equation*}

such that

\begin{equation*} J_{31} A_{31} J_{31}^T = \left( \begin{array}{c c} \color{blue}{\gamma_{11}} \amp\color{blue}{- \sigma_{31}} \\ \color{blue}{\sigma_{31}} \amp \color{blue}{\gamma_{33}} \end{array} \right) \left( \begin{array}{c c} \color{blue}{\alpha_{11}} \amp \color{blue}{\alpha_{31}} \\ \color{blue}{\alpha_{31}} \amp \color{blue}{\alpha_{33}} \end{array} \right) \left( \begin{array}{c c} \color{blue}{\gamma_{11}} \amp\color{blue}{- \sigma_{31}} \\ \color{blue}{\sigma_{31}} \amp \color{blue}{\gamma_{33}} \end{array} \right)^T = \left( \begin{array}{c c} \color{blue}{\widehat \alpha_{11}} \amp \color{blue}{ 0} \\ \color{blue}{ 0} \amp \color{blue}{\widehat \alpha_{33}} \end{array} \right). \end{equation*}

We know this exists since the Spectral Decomposition of the $2 \times 2$ matrix exists. Such a rotation is called a Jacobi rotation. (Notice that it is different from a Givens' rotation (Subsection 10.3.3) because it diagonalizes a $2 \times 2$ matrix when used as a unitary similarity transformation. By contrast, a Givens' rotation zeroes an element when applied from one side of a matrix.)
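Concretely, the rotation can be computed from $\alpha_{11}, \alpha_{31}, \alpha_{33}$ via the standard stable formulas that solve $t^2 + 2 \tau t - 1 = 0$ for $t = \tan \theta$, where $\tau = ( \alpha_{33} - \alpha_{11} ) / ( 2 \alpha_{31} ) = \cot 2 \theta \text{.}$ The following Python/NumPy sketch illustrates this; the function name `jacobi_rotation` and the choice of the smaller-magnitude root are ours, not taken from the text:

```python
import numpy as np

def jacobi_rotation(a11, a31, a33):
    """Compute (gamma, sigma) = (c, s) so that, with J = [[c, -s], [s, c]],
    J @ [[a11, a31], [a31, a33]] @ J.T is diagonal.  Uses the standard
    stable formulas, picking the root of smaller magnitude."""
    if a31 == 0.0:
        return 1.0, 0.0               # already diagonal
    tau = (a33 - a11) / (2.0 * a31)   # cot(2 theta)
    if tau != 0.0:
        # smaller-magnitude root of t^2 + 2 tau t - 1 = 0
        t = np.sign(tau) / (abs(tau) + np.hypot(1.0, tau))
    else:
        t = 1.0                       # 45 degree rotation
    c = 1.0 / np.sqrt(1.0 + t * t)   # cos(theta)
    s = t * c                         # sin(theta)
    return c, s
```

Choosing the smaller-magnitude root keeps the rotation angle at or below $45$ degrees, which is the standard choice in convergence analyses of the Jacobi method.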

###### Homework10.4.5.1.

In the above discussion, show that $\alpha_{11}^2+ 2 \alpha_{31}^2 + \alpha_{33}^2 = \widehat \alpha_{11}^2 + \widehat \alpha_{33}^2 \text{.}$

Solution

If $\widehat A_{31} = J_{31} A_{31} J_{31}^T$ then $\| \widehat A_{31} \|_F^2 = \| A_{31} \|_F^2$ because multiplying on the left and/or the right by a unitary matrix preserves the Frobenius norm of a matrix. Hence

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue}{ \alpha_{11}} \amp \color{blue}{ \alpha_{31}} \\ \color{blue}{ \alpha_{31}} \amp \color{blue}{ \alpha_{33}} \end{array} \right) \right\|_F^2 = \left\| \left( \begin{array}{c c} \color{blue}{ \widehat \alpha_{11}} \amp \color{blue}{ 0} \\ \color{blue}{ 0} \amp \color{blue}{ \widehat \alpha_{33} } \end{array} \right) \right\|_F^2. \end{equation*}

But

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue} \alpha_{11} \amp \color{blue} \alpha_{31} \\ \color{blue} \alpha_{31} \amp \color{blue} \alpha_{33} \end{array} \right) \right\|_F^2 = \alpha_{11}^2 + 2 \alpha_{31}^2 + \alpha_{33}^2 \end{equation*}

and

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue}{ \widehat \alpha_{11}} \amp \color{blue} 0 \\ \color{blue} 0 \amp \color{blue}{ \widehat \alpha_{33}} \end{array} \right) \right\|_F^2 = \widehat \alpha_{11}^2 + \widehat \alpha_{33}^2, \end{equation*}

which proves the result.

Jacobi rotations can be used to selectively zero off-diagonal elements by observing the following:

\begin{equation*} \begin{array}{l} J A J^T = \\ ~~~ \begin{array}[t]{r c l} \left( \begin{array}{c | c | c | c | c } I \amp 0 \amp 0 \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \gamma_{11}} \amp 0 \amp \color{blue}{ -\sigma_{31}} \amp 0 \\ \hline 0 \amp 0 \amp I \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \sigma_{31}} \amp 0 \amp \color{blue}{ \gamma_{33}} \amp 0 \\ \hline 0 \amp 0 \amp 0 \amp 0 \amp I \end{array} \right) \amp \left( \begin{array}{c | c | c | c | c } A_{00} \amp \color{red}{ a_{10}} \amp A_{20}^T \amp \color{red}{ a_{30}} \amp A_{40}^T \\ \hline \color{red}{ a_{10}^T} \amp \color{blue}{ \alpha_{11}} \amp \color{red}{ a_{21}^T} \amp \color{blue}{ \alpha_{31}} \amp \color{red}{ a_{41}^T} \\ \hline A_{20} \amp \color{red}{ a_{21}} \amp A_{22} \amp \color{red}{ a_{32}} \amp A_{42}^T \\ \hline \color{red}{ a_{30}^T} \amp \color{blue}{ \alpha_{31}} \amp \color{red}{ a_{32}^T} \amp \color{blue}{ \alpha_{33}} \amp \color{red}{ a_{43}^T} \\ \hline A_{40} \amp \color{red}{ a_{41}} \amp A_{42} \amp \color{red}{ a_{43}} \amp A_{44} \end{array} \right) \amp \left( \begin{array}{c | c | c | c | c } I \amp 0 \amp 0 \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \gamma_{11}} \amp 0 \amp \color{blue}{ -\sigma_{31}} \amp 0 \\ \hline 0 \amp 0 \amp I \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \sigma_{31}} \amp 0 \amp \color{blue}{ \gamma_{33}} \amp 0 \\ \hline 0 \amp 0 \amp 0 \amp 0 \amp I \end{array} \right)^T \\ \\ ~~~= \amp\left( \begin{array}{c | c | c | c | c } A_{00} \amp \color{red}{\widehat a_{10}} \amp A_{20}^T \amp \color{red}{\widehat a_{30}} \amp A_{40}^T \\ \hline \color{red}{ \widehat a_{10}^T} \amp \color{blue}{ \widehat \alpha_{11}} \amp\color{red}{\widehat a_{21}^T} \amp \color{blue}{ 0} \amp \color{red}{ \widehat a_{41}^T} \\ \hline A_{20} \amp \color{red}{ \widehat a_{21}} \amp A_{22} \amp \color{red}{ \widehat a_{32}} \amp A_{42}^T \\ \hline \color{red}{ \widehat a_{30}^T} \amp \color{blue}{ 0} \amp \color{red}{ \widehat a_{32}^T} \amp \color{blue}{ \widehat \alpha_{33}} \amp \color{red}{ \widehat a_{43}^T} \\ \hline A_{40} \amp \color{red}{ \widehat a_{41}} \amp A_{42} \amp \color{red}{ \widehat a_{43}} \amp A_{44} \end{array} \right) \amp = \widehat A , \end{array} \end{array} \end{equation*}

where

\begin{equation*} \left( \begin{array}{c c} \color{blue}{ \gamma_{11}} \amp\color{blue}{ - \sigma_{31}} \\ \color{blue}{ \sigma_{31}} \amp \color{blue}{ \gamma_{33}} \end{array} \right) \left( \begin{array}{c | c | c | c | c } \color{red}{ a_{10}^T} \amp ~~ \amp \color{red}{ a_{21}^T} \amp ~~ \amp \color{red}{ a_{41}^T} \\ \color{red}{ a_{30}^T} \amp ~~ \amp \color{red}{ a_{32}^T} \amp ~~ \amp \color{red}{ a_{43}^T} \end{array} \right) = \left( \begin{array}{c | c | c | c | c } \color{red}{ \widehat a_{10}^T} \amp ~~ \amp \color{red}{ \widehat a_{21}^T} \amp ~~ \amp \color{red}{ \widehat a_{41}^T} \\ \color{red}{ \widehat a_{30}^T} \amp ~~ \amp \color{red}{ \widehat a_{32}^T} \amp ~~ \amp \color{red}{ \widehat a_{43}^T} \end{array} \right). \end{equation*}

Importantly,

\begin{equation*} \begin{array}{rcl} a_{10}^T a_{10} + a_{30}^T a_{30} \amp= \amp \widehat a_{10}^T \widehat a_{10} + \widehat a_{30}^T \widehat a_{30} \\ a_{21}^T a_{21} + a_{32}^T a_{32} \amp= \amp \widehat a_{21}^T \widehat a_{21} + \widehat a_{32}^T \widehat a_{32} \\ a_{41}^T a_{41} + a_{43}^T a_{43} \amp= \amp \widehat a_{41}^T \widehat a_{41} + \widehat a_{43}^T \widehat a_{43}. \end{array} \end{equation*}

What this means is that if one defines ${\rm off}( A )$ as the square of the Frobenius norm of the off-diagonal elements of $A \text{,}$

\begin{equation*} {\rm off}( A ) = \| A \|_F^2 - \| \diag{ A } \|_F^2 , \end{equation*}

then ${\rm off}( \widehat A ) = {\rm off}( A ) - 2 \alpha_{31}^2 \text{.}$

• The good news: every time a Jacobi rotation is used to zero an off-diagonal element, ${\rm off} ( A )$ decreases by twice the square of that element.

• The bad news: a previously introduced zero may become nonzero in the process.
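Both observations can be checked numerically. The Python/NumPy sketch below (the names `off` and `apply_jacobi` are ours) embeds the Jacobi rotation in an identity matrix exactly as in the partitioned picture above; the full $n \times n$ matrix $J$ is formed only for clarity, whereas a practical implementation would update just the two affected rows and columns in $O( n )$ flops:

```python
import numpy as np

def off(A):
    """off(A): the square of the Frobenius norm of the off-diagonal part."""
    return np.linalg.norm(A, 'fro')**2 - np.linalg.norm(np.diag(A))**2

def apply_jacobi(A, p, q):
    """Return J A J^T, where the Jacobi rotation that annihilates A[q, p]
    (p < q, A symmetric) is embedded in the identity."""
    apq = A[q, p]
    if apq == 0.0:
        return A.copy()
    tau = (A[q, q] - A[p, p]) / (2.0 * apq)
    t = np.sign(tau) / (abs(tau) + np.hypot(1.0, tau)) if tau != 0.0 else 1.0
    c = 1.0 / np.sqrt(1.0 + t * t)
    s = t * c
    J = np.eye(A.shape[0])           # rotation embedded in the identity
    J[p, p] = c; J[p, q] = -s
    J[q, p] = s; J[q, q] = c
    return J @ A @ J.T
```

Running this on a random symmetric matrix confirms that the targeted element becomes zero and that ${\rm off}( \widehat A ) = {\rm off}( A ) - 2 \alpha_{31}^2$ holds to rounding error.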

The original algorithm developed by Jacobi searched for the largest (in absolute value) off-diagonal element and zeroed it, repeating this process until all off-diagonal elements were small. The algorithm was applied by hand by one of his students, Seidel (of Gauss-Seidel fame). The problem is that searching for the largest off-diagonal element requires $O( n^2 )$ comparisons, while computing and applying one Jacobi rotation as a similarity transformation requires only $O( n )$ flops. For large $n$ this is not practical. Instead, it can be shown that zeroing the off-diagonal elements by columns (or rows) also converges to a diagonal matrix. This is known as the column-cyclic Jacobi algorithm. We illustrate this in Figure 10.4.5.1.
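A column-cyclic sweep is easy to sketch. The following Python/NumPy routine is a simplified illustration of the idea, not necessarily the exact algorithm of Figure 10.4.5.1; in particular, it forms full rotation matrices rather than performing the $O( n )$ row and column updates a real implementation would use:

```python
import numpy as np

def jacobi_cyclic(A, tol=1e-28, max_sweeps=30):
    """Column-cyclic Jacobi eigensolver for symmetric A (a sketch).

    Sweeps through the strictly lower triangle by columns, zeroing each
    off-diagonal element with a Jacobi rotation, until off(A) is tiny
    relative to ||A||_F^2.  Returns (approximate eigenvalues, a matrix
    whose columns are the corresponding eigenvectors)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    V = np.eye(n)                       # accumulates the rotations
    nrm2 = np.linalg.norm(A, 'fro')**2  # invariant under the rotations
    for _ in range(max_sweeps):
        for p in range(n - 1):          # column index
            for q in range(p + 1, n):   # row index below the diagonal
                if A[q, p] == 0.0:
                    continue
                tau = (A[q, q] - A[p, p]) / (2.0 * A[q, p])
                t = np.sign(tau) / (abs(tau) + np.hypot(1.0, tau)) if tau != 0.0 else 1.0
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = t * c
                J = np.eye(n)           # rotation embedded in the identity
                J[p, p] = c; J[p, q] = -s
                J[q, p] = s; J[q, q] = c
                A = J @ A @ J.T         # a real code updates 2 rows/columns
                V = J @ V
        off_A = nrm2 - np.linalg.norm(np.diag(A))**2
        if off_A <= tol * nrm2:         # off(A) negligible: converged
            break
    return np.diag(A), V.T
```

Because $\widehat A = V A V^T$ with $V$ the accumulated product of rotations, the columns of $V^T$ approximate the eigenvectors of the original matrix.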