## Subsection11.3.3Jacobi's method for computing the Singular Value Decomposition

Just like the QR algorithm for computing the Spectral Decomposition was modified to compute the SVD, so can the Jacobi Method for computing the Spectral Decomposition.

The insight is very simple. Let $A \in \mathbb R^{m \times n}$ and partition it by columns:

\begin{equation*} A = \left( \begin{array}{c | c | c | c} a_0 \amp a_1 \amp \cdots \amp a_{n-1} \end{array} \right). \end{equation*}

One could form $B = A^T A$ and then compute Jacobi rotations to diagonalize it:

\begin{equation*} \begin{array}[t]{c} \underbrace{ \cdots J_{3,1}^T J_{2,1}^T } \\ Q^T \end{array} B \begin{array}[t]{c} \underbrace{ J_{2,1} J_{3,1} \cdots } \\ Q \end{array} = D. \end{equation*}

We recall that if we order the columns of $Q$ and diagonal elements of $D$ appropriately, then choosing $V = Q$ and $\Sigma = D^{1/2}$ yields

\begin{equation*} A = U \Sigma V^T = U D^{1/2} Q^T \end{equation*}

or, equivalently,

\begin{equation*} A Q = U \Sigma = U D^{1/2}. \end{equation*}

This means that if we apply the Jacobi rotations $J_{2,1}, J_{3,1}, \ldots$ from the right to $A \text{,}$

\begin{equation*} U D^{1/2} = ( ( A J_{2,1} ) J_{3,1} ) \cdots, \end{equation*}

then, once $B$ has become (approximately) diagonal, the columns of $\widehat A = ( ( A J_{2,1} ) J_{3,1} ) \cdots$ are mutually orthogonal. By scaling them to have length one, setting $\Sigma = \diag{ \| \widehat a_0 \|_2, \| \widehat a_1 \|_2, \ldots , \| \widehat a_{n-1} \|_2} \text{,}$ we find that

\begin{equation*} U = \widehat A \Sigma^{-1} = A Q (D^{1/2})^{-1}. \end{equation*}

The only problem is that in forming $B \text{,}$ we may introduce unnecessary error since it squares the condition number.

Here is a more practical algorithm. We notice that

\begin{equation*} B = A^T A = \left( \begin{array}{c c c c} a_0^T a_0 \amp a_0^T a_1 \amp \cdots \amp a_0^T a_{n-1} \\ a_1^T a_0 \amp a_1^T a_1 \amp \cdots \amp a_1^T a_{n-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ a_{n-1}^T a_0 \amp a_{n-1}^T a_1 \amp \cdots \amp a_{n-1}^T a_{n-1} \end{array} \right). \end{equation*}

We observe that we don't need to form all of $B \text{.}$ When it is time to compute $J_{i,j} \text{,}$ we need only compute

\begin{equation*} \left( \begin{array}{cc} \beta_{i,i} \amp \beta_{j,i} \\ \beta_{j,i} \amp \beta_{j,j} \end{array}\right) = \left( \begin{array}{cc} a_i^T a_i \amp a_j^T a_i \\ a_j^T a_i \amp a_j^T a_j \end{array}\right) , \end{equation*}

from which $J_{i,j}$ can be computed. By instead applying this Jacobi rotation to $B \text{,}$ we observe that

\begin{equation*} J_{i,j}^T B J_{i,j} = J_{i,j}^T A^T A J_{i,j} = ( A J_{i,j} )^T ( A J_{i,j} ) \end{equation*}

and hence the Jacobi rotation can instead be used to take linear combinations of the $i$th and $j$th columns of $A \text{:}$

\begin{equation*} \left( \begin{array}{c | c} a_i \amp a_j \end{array} \right) := \left( \begin{array}{c | c} a_i \amp a_j \end{array} \right) \left( \begin{array}{c c} \gamma_{i,j} \amp -\sigma_{i,j} \\ \sigma_{i,j} \amp \gamma_{i,j} \end{array} \right) . \end{equation*}

We have thus outlined an algorithm:

• Starting with matrix $A \text{,}$ compute a sequence of Jacobi rotations (e.g., corresponding to a column-cyclic Jacobi method) until the off-diagonal elements of $A^T A$ (parts of which are formed as Jacobi rotations are computed) become small. Every time a Jacobi rotation is computed, it updates the appropriate columns of $A \text{.}$

• Accumulate the Jacobi rotations into matrix $V \text{,}$ by applying them from the right to an identity matrix:

\begin{equation*} V = (( I \times J_{2,1}) J_{3,1} ) \cdots \end{equation*}
• Upon completion,

\begin{equation*} \Sigma = \diag{ \| a_0 \|_2, \| a_1 \|_2, \ldots , \| a_{n-1} \|_2} \end{equation*}

and

\begin{equation*} U = A \Sigma^{-1}, \end{equation*}

meaning that each column of the updated $A$ is divided by its length.

• If necessary, reorder the columns of $U$ and $V$ and the diagonal elements of $\Sigma \text{.}$

Obviously, there are variations on this theme. Such methods are known as one-sided Jacobi methods.