Skip to main content

Subsection 10.4.5 Jacobi algorithm for the symmetric eigenvalue problem

\begin{equation*} \begin{array}{c c c c c c c c} \amp \amp \mbox{Sweep 1} \amp \amp \amp \amp \amp\\ \hline \begin{array}{c c c c} \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \times \amp \times \\ \color{red} \times \amp \color{red} \times \amp \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \color{red} \times \amp \times \amp \color{red} \times \amp \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \times \amp \color{red} \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \color{red} \times \amp \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \times \amp \times \amp \color{red} \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \end{array} \amp \rightarrow \\ % \mbox{zero } (1,0) \amp \amp \mbox{zero } (2,0) \amp \amp \mbox{zero } (3,0) \\ \\ \begin{array}{c c c c} \times \amp \color{red} \times \amp \color{red} \times \amp 0 \\ \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \times \amp \color{red} \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \times \amp \color{red} \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \times \amp \times \amp \color{red} \times \amp \color{red} \times \\ \times \amp \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \end{array} \amp \rightarrow \\ \mbox{zero } (2,1) \amp \amp \mbox{zero } (3,1) \amp \amp \mbox{zero } (3,2) \end{array} \end{equation*}
\begin{equation*} \begin{array}{c c c c c c c c} \amp \amp \mbox{Sweep 2} \amp \amp \amp \amp \amp\\ \hline \begin{array}{c c c c} \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \times \amp \times \\ \color{red} \times \amp \color{red} \times \amp \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \color{red} \times \amp \times \amp \color{red} \times \amp \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \times \amp \color{red} \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \color{red} \times \amp \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \times \amp \times \amp \color{red} \times \\ \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ \end{array} \amp \rightarrow \\ \mbox{zero } (1,0) \amp \amp \mbox{zero } (2,0) \amp \amp \mbox{zero } (3,0) \\ \\ \begin{array}{c c c c} \times \amp \color{red} \times \amp \color{red} \times \amp 0 \\ \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \times \amp \color{red} \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \times \amp \color{red} \times \amp \times \amp \color{red} \times \\ \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \amp \color{red} \times \\ \end{array} \amp \rightarrow \amp \begin{array}{c c c c} \times \amp \times \amp \color{red} \times \amp \color{red} \times \\ \times \amp \times \amp \color{red} \times \amp \color{red} \times \\ \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp\color{blue} 0 \\ \color{red} \times \amp \color{red} \times \amp \color{blue} 0 \amp \color{red} \times \\ \end{array} \amp \rightarrow \\ \mbox{zero } (2,1) \amp \amp \mbox{zero } (3,1) \amp \amp \mbox{zero } (3,2) \end{array} \end{equation*}
Figure 10.4.5.1. Column-cyclic Jacobi algorithm.

The oldest algorithm for computing the eigenvalues and eigenvectors of a matrix is due to Jacobi and dates back to 1846

  • [16] C. G. J. Jacobi, Über ein leichtes Verfahren, die in der Theorie der Sä kular-störungen vorkommenden Gleichungen numerisch aufzulösen, Crelle's Journal 30, 51-94 (1846).

This is a method that keeps resurfacing, since it parallelizes easily. The operation count tends to be higher (by a constant factor) than that of reduction to tridiagonal form followed by the tridiagonal QR algorithm.

The idea is as follows: Given a symmetric \(2 \times 2 \) matrix

\begin{equation*} A_{31} = \left( \begin{array}{c c} \alpha_{11} \amp \alpha_{13} \\ \alpha_{31} \amp \alpha_{33} \end{array} \right) \end{equation*}

There exists a rotation (which is unitary)

\begin{equation*} J_{31} = \left( \begin{array}{c c} \gamma_{11} \amp - \sigma_{13} \\ \sigma_{31} \amp \gamma_{33} \end{array} \right) \end{equation*}

such that

\begin{equation*} J_{31} A_{31} J_{31}^T = \left( \begin{array}{c c} \color{blue}{\gamma_{11}} \amp\color{blue}{- \sigma_{31}} \\ \color{blue}{\sigma_{31}} \amp \color{blue}{\gamma_{33}} \end{array} \right) \left( \begin{array}{c c} \color{blue}{\alpha_{11}} \amp \color{blue}{\alpha_{31}} \\ \color{blue}{\alpha_{31}} \amp \color{blue}{\alpha_{33}} \end{array} \right) \left( \begin{array}{c c} \color{blue}{\gamma_{11}} \amp\color{blue}{- \sigma_{31}} \\ \color{blue}{\sigma_{31}} \amp \color{blue}{\gamma_{33}} \end{array} \right)^T = \left( \begin{array}{c c} \color{blue}{\widehat \alpha_{11}} \amp \color{blue}{ 0} \\ \color{blue}{ 0} \amp \color{blue}{\widehat \alpha_{33}} \end{array} \right). \end{equation*}

We know this exists since the Spectral Decomposition of the \(2 \times 2 \) matrix exists. Such a rotation is called a Jacobi rotation. (Notice that it is different from a Givens' rotation Subsection 10.3.3 because it diagonalizes a \(2 \times 2 \) matrix when used as a unitary similarity transformation. By contrast, a Givens' rotation zeroes an element when applied from one side of a matrix.)

Homework 10.4.5.1.

In the above discussion, show that \(\alpha_{11}^2+ 2 \alpha_{31}^2 + \alpha_{33}^2 = \widehat \alpha_{11}^2 + \widehat \alpha_{33}^2 \text{.}\)

Solution

If \(\widehat A_{31} = J_{31} A_{31} J_{31}^T \) then \(\| \widehat A_{31} \|_F^2 = \| A_{31} \|_F^2 \) because multiplying on the left and/or the right by a unitary matrix preserves the Frobenius norm of a matrix. Hence

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue}{ \alpha_{11}} \amp \color{blue}{ \alpha_{31}} \\ \color{blue}{ \alpha_{31}} \amp \color{blue}{ \alpha_{33}} \end{array} \right) \right\|_F^2 = \left\| \left( \begin{array}{c c} \color{blue}{ \widehat \alpha_{11}} \amp \color{blue}{ 0} \\ \color{blue}{ 0} \amp \color{blue}{ \widehat \alpha_{33} } \end{array} \right) \right\|_F^2. \end{equation*}

But

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue} \alpha_{11} \amp \color{blue} \alpha_{31} \\ \color{blue} \alpha_{31} \amp \color{blue} \alpha_{33} \end{array} \right) \right\|_F^2 = \alpha_{11}^2 + 2 \alpha_{31}^2 + \alpha_{33}^2 \end{equation*}

and

\begin{equation*} \left\| \left( \begin{array}{c c} \color{blue}{ \widehat \alpha_{11}} \amp \color{blue} 0 \\ \color{blue} 0 \amp \color{blue}{ \widehat \alpha_{33}} \end{array} \right) \right\|_F^2 = \widehat \alpha_{11}^2 + + \widehat \alpha_{33}^2, \end{equation*}

which proves the result.

Jacobi rotation rotations can be used to selectively zero off-diagonal elements by observing the following:

\begin{equation*} \begin{array}{l} J A J^T = \\ ~~~ \begin{array}[t]{r c l} \left( \begin{array}{c | c | c | c | c } I \amp 0 \amp 0 \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \gamma_{11}} \amp 0 \amp \color{blue}{ -\sigma_{31}} \amp 0 \\ \hline 0 \amp 0 \amp I \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \sigma_{31}} \amp 0 \amp \color{blue}{ \gamma_{33}} \amp 0 \\ \hline 0 \amp 0 \amp 0 \amp 0 \amp I \end{array} \right) \amp \left( \begin{array}{c | c | c | c | c } A_{00} \amp \color{red}{ a_{10}} \amp A_{20}^T \amp \color{red}{ a_{30}} \amp A_{40}^T \\ \hline \color{red}{ a_{10}^T} \amp \color{blue}{ \alpha_{11}} \amp \color{red}{ a_{21}^T} \amp \color{blue}{ \alpha_{31}} \amp \color{red}{ a_{41}^T} \\ \hline A_{20} \amp \color{red}{ a_{21}} \amp A_{22} \amp \color{red}{ a_{32}} \amp A_{42}^T \\ \hline \color{red}{ a_{30}^T} \amp \color{blue}{ \alpha_{31}} \amp \color{red}{ a_{32}^T} \amp \color{blue}{ \alpha_{33}} \amp \color{red}{ a_{43}^T} \\ \hline A_{40} \amp \color{red}{ a_{41}} \amp A_{42} \amp \color{red}{ a_{43}} \amp A_{44} \end{array} \right) \amp \left( \begin{array}{c | c | c | c | c } I \amp 0 \amp 0 \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \gamma_{11}} \amp 0 \amp \color{blue}{ -\sigma_{13}} \amp 0 \\ \hline 0 \amp 0 \amp I \amp 0 \amp 0 \\ \hline 0 \amp \color{blue}{ \sigma_{31}} \amp 0 \amp \color{blue}{ \gamma_{33}} \amp 0 \\ \hline 0 \amp 0 \amp 0 \amp 0 \amp I \end{array} \right)^T \\ \\ ~~~= \amp\left( \begin{array}{c | c | c | c | c } A_{00} \amp \color{red}{\widehat a_{10}} \amp A_{20}^T \amp \color{red}{\widehat a_{30}} \amp A_{40}^T \\ \hline \color{red}{ \widehat a_{10}^T} \amp \color{blue}{ \widehat \alpha_{11}} \amp\color{red}{\widehat a_{21}^T} \amp \color{blue}{ 0} \amp \color{red}{ \widehat a_{41}^T} \\ \hline A_{20} \amp \color{red}{ \widehat a_{21}} \amp A_{22} \amp \color{red}{ \widehat a_{32}} \amp A_{42}^T \\ \hline \color{red}{ \widehat a_{30}^T} \amp \color{blue}{ 0} \amp \color{red}{ \widehat a_{32}^T} \amp \color{blue}{ \widehat \alpha_{33}} \amp \color{red}{ \widehat a_{43}^T} \\ \hline A_{40} \amp \color{red}{ \widehat a_{41}} \amp A_{42} \amp \color{red}{ \widehat a_{43}} \amp A_{44} \end{array} \right) \amp = \widehat A , \end{array} \end{array} \end{equation*}

where

\begin{equation*} \left( \begin{array}{c c} \color{blue}{ \gamma_{11}} \amp\color{blue}{ - \sigma_{31}} \\ \color{blue}{ \sigma_{31}} \amp \color{blue}{ \gamma_{33}} \end{array} \right) \left( \begin{array}{c | c | c | c | c } \color{red}{ a_{10}^T} \amp ~~ \amp \color{red}{ a_{21}^T} \amp ~~ \amp \color{red}{ a_{41}^T} \\ \color{red}{ a_{30}^T} \amp ~~ \amp \color{red}{ a_{32}^T} \amp ~~ \amp \color{red}{ a_{43}^T} \end{array} \right) = \left( \begin{array}{c | c | c | c | c } \color{red}{ \widehat a_{10}^T} \amp ~~ \amp \color{red}{ \widehat a_{21}^T} \amp ~~ \amp \color{red}{ \widehat a_{41}^T} \\ \color{red}{ \widehat a_{30}^T} \amp ~~ \amp \color{red}{ \widehat a_{32}^T} \amp ~~ \amp \color{red}{ \widehat a_{43}^T} \end{array} \right). \end{equation*}

Importantly,

\begin{equation*} \begin{array}{rcl} a_{10}^T a_{10} + a_{30}^T a_{30} \amp= \amp \widehat a_{10}^T \widehat a_{10} + \widehat a_{30}^T \widehat a_{30} \\ a_{21}^T a_{21} + a_{32}^T a_{32} \amp= \amp \widehat a_{21}^T \widehat a_{21} + \widehat a_{32}^T \widehat a_{32} \\ a_{41}^T a_{41} + a_{43}^T a_{43} \amp= \amp \widehat a_{41}^T \widehat a_{41} + \widehat a_{43}^T \widehat a_{43}. \end{array} \end{equation*}

What this means is that if one defines \({\rm off}( A ) \) as the square of the Frobenius norm of the off-diagonal elements of \(A \text{,}\)

\begin{equation*} {\rm off}( A ) = \| A \|_F^2 - \| \diag{ A } \|_F^2 , \end{equation*}

then \({\rm off}( \widehat A ) = {\rm off}( A ) - 2 \alpha_{31}^2 \text{.}\)

  • The good news: every time a Jacobi rotation is used to zero an off-diagonal element, \({\rm off} ( A )\) decreases by twice the square of that element.

  • The bad news: a previously introduced zero may become nonzero in the process.

The original algorithm developed by Jacobi searched for the largest (in absolute value) off-diagonal element and zeroed it, repeating this processess until all off-diagonal elements were small. The algorithm was applied by hand by one of his students, Seidel (of Gauss-Seidel fame). The problem with this is that searching for the largest off-diagonal element requires \(O( n^2) \) comparisons. Computing and applying one Jacobi rotation as a similarity transformation requires \(O(n) \) flops. For large \(n \) this is not practical. Instead, it can be shown that zeroing the off-diagonal elements by columns (or rows) also converges to a diagonal matrix. This is known as the column-cyclic Jacobi algorithm. We illustrate this in Figure 10.4.5.1.