ALAFF Reduction to tridiagonal form

Subsection 10.3.1 Reduction to tridiagonal form

In this section, we see that if \(A^{(0)} \) is a tridiagonal matrix, then so are all \(A^{(k)} \text{.}\) This reduces the cost of each iteration of the QR algorithm from \(O( m^3 ) \) flops to \(O( m ) \) flops if only the eigenvalues are computed and \(O( m^2 ) \) flops if the eigenvectors are also desired. Thus, if matrix \(A \) is first reduced to a tridiagonal matrix via (unitary) similarity transformations, then the cost of finding its eigenvalues and eigenvectors is reduced from \(O( m^4 ) \) to \(O( m^3 ) \) flops. Fortunately, there is an algorithm for reducing a matrix to tridiagonal form that requires \(O( m^3 ) \) operations.

The basic algorithm for reducing a Hermitian matrix to tridiagonal form, overwriting the original matrix with the result, can be explained as follows. We assume that \(A \) is stored only in the lower triangular part of the matrix and that only the diagonal and subdiagonal of the tridiagonal matrix is computed, overwriting those parts of \(A \text{.}\) Finally, the Householder vectors used to zero out parts of \(A\) can overwrite the entries that they annihilate (set to zero), much like we did when computing the Householder QR factorization.

Recall that in Subsubsection 3.3.3.3, we introduced the function

\begin{equation*} \left[ \FlaTwoByOneSingleLine {\rho} { u_2}, \tau \right] := {\rm Housev} \left( \FlaTwoByOneSingleLine { \chi_1 } { x_2 } \right) \end{equation*}

to compute the vector \(u = \FlaTwoByOneSingleLine{1}{u_2}\) that reflects \(x \) into \(\complexone \| x \|_2 e_0 \) so that

\begin{equation*} \left( I - \frac{1}{\tau} \FlaTwoByOneSingleLine{1}{u_2} \FlaTwoByOneSingleLine{1}{u_2}^H \right) \FlaTwoByOneSingleLine { \chi_1 } { x_2 } = \complexone \begin{array}[t]{c} n \underbrace{\| x \|_2}\\ \rho \end{array} e_0. \end{equation*}

We are going to use a variation on this function:

\begin{equation*} \left[ u, \tau \right] := {\rm Housev} \left( x \right) \end{equation*}

implemented by the function

function [ u, tau ] = Housev1( x )

We also reintroduce the notation \(H(x) \) for the transformation \(I - \frac{1}{\tau} u u^H \) where \(u \) and \(\tau \) are computed by \({\rm Housev1}( x ) \text{.}\)

We now describe an algorithm for reducing a Hermitian matrix to tridiagonal form:

Partition \(A \rightarrow \FlaTwoByTwoSingleLine { \alpha_{11} }{ \star } { a_{21} }{ A_{22} }\text{.}\) Here the \(\star \) denotes a part of a matrix that is neither stored nor updated.
Update \([ a_{21}, \tau ] \becomes {\rm Housev1}( a_{21} ) \text{.}\) This overwrites the first element of \(a_{21} \) with \(\complexone \| a_{21} \|_2 \) and the remainder with all but the first element of the Householder vector \(u \text{.}\) Implicitly, the elements below the first element equal zero in the updated matrix \(A \text{.}\)
Update

\begin{equation*} A_{22} := H( a_{21} ) A_{22} H( a_{21} ). \end{equation*}

Since \(A_{22} \) is Hermitian both before and after the update, only the lower triangular part of the matrix needs to be updated.
Continue this process with the updated \(A_{22} \text{.}\)

This approach is illustrated in Figure 10.3.1.1.

\begin{equation*} \begin{array}{| c c c c c} \hline \times \amp {\color{gray} \times} \amp {\color{gray} \times} \amp {\color{gray} \times} \amp {\color{gray} \times} \\ \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} \times} \amp {\color{gray} \times} \\ \times \amp \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} \times} \\ \times \amp \times \amp \times \amp \times \amp {\color{gray} \times} \\ \times \amp \times \amp \times \amp \times \amp \times \end{array} \end{equation*}

\(\begin{array}{c} \\ \\ \\ \longrightarrow \end{array}\)

\begin{equation*} \begin{array}{| c c c c c} \hline \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} \times} \amp {\color{gray} \times} \\ 0 \amp \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} \times} \\ 0 \amp \times \amp \times \amp \times \amp {\color{gray} \times} \\ 0 \amp \times \amp \times \amp \times \amp \times \end{array} \end{equation*}

\(\begin{array}{c} \\ \\ \\ \longrightarrow \end{array}\)

\begin{equation*} \begin{array}{c | c c c c} \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ \hline \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ 0 \amp \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} \times} \\ 0 \amp 0 \amp \times \amp \times \amp {\color{gray} \times} \\ 0 \amp 0 \amp \times \amp \times \amp \times \end{array} \end{equation*}

Original matrix

First iteration

Second iteration

\(\begin{array}{c} \\ \\ \\ \longrightarrow \end{array}\)

\begin{equation*} \begin{array}{c c | c c c} \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ \hline 0 \amp \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} 0} \\ 0 \amp 0 \amp \times \amp \times \amp {\color{gray} \times} \\ 0 \amp 0 \amp 0 \amp \times \amp \times \end{array} \end{equation*}

\(\begin{array}{c} \\ \\ \\ \longrightarrow \end{array}\)

\begin{equation*} \begin{array}{c c c | c c} \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} 0} \amp {\color{gray} 0} \\ 0 \amp \times \amp \times \amp {\color{gray} \times} \amp {\color{gray} 0} \\ \hline 0 \amp 0 \amp \times \amp \times \amp {\color{gray} \times} \\ 0 \amp 0 \amp 0 \amp \times \amp \times \end{array} \end{equation*}

Third iteration

Figure 10.3.1.1. An illustration of the reduction of a Hermitian matrix to tridiagonal form. The \(\times \)s denote nonzero elements in the matrix. The gray entries above the diagonal are not actually updated.

The update of \(A_{22} \) warrants closer scrutiny:

\begin{equation*} \begin{array}{rcl} A_{22} \amp := \amp H( a_{21} ) A_{22} H( a_{21} ) \\ \amp = \amp ( I - \frac{1}{\tau} u_{21} u_{21}^H ) A_{22} ( I - \frac{1}{\tau} u_{21} u_{21}^H ) \\ \amp = \amp ( A_{22} - \frac{1}{\tau} u_{21} \begin{array}[t]{c} \underbrace{ u_{21}^H A_{22}} \\ y_{21}^H \end{array} ) ( I - \frac{1}{\tau} u_{21} u_{21}^H ) \\ \amp = \amp A_{22} - \frac{1}{\tau} u_{21} y_{21}^H - \frac{1}{\tau} \begin{array}[t]{c} \underbrace{ A_{22} u_{21}}\\ y_{21} \end{array} u_{21}^H + \frac{1}{\tau^2} u_{21} \begin{array}[t]{c} \underbrace{ y_{21}^H u_{21} } \\ 2 \beta \end{array} u_{21}^H \\ \amp= \amp A_{22} - \left( \frac{1}{\tau} u_{21} y_{21}^H - \frac{\beta}{\tau^2}u_{21} u_{21}^H \right) - \left( \frac{1}{\tau} y_{21} u_{21}^H - \frac{\beta}{\tau^2}u_{21} u_{21}^H \right) \\ \amp= \amp A_{22} - u_{21} \begin{array}[t]{c} \underbrace{ \frac{1}{\tau} \left( y_{21}^H - \frac{\beta}{\tau} u_{21}^H \right) } \\ w_{21}^H \end{array} - \begin{array}[t]{c} \underbrace{ \frac{1}{\tau} \left( y_{21} - \frac{\beta}{\tau}u_{21} \right) } \\ w_{21} \end{array} u_{21}^H \\ \amp = \amp \begin{array}[t]{c} \underbrace{ A_{22} - u_{21} w_{21}^H - w_{21} u_{21}^H. } \\ \mbox{Hermitian}\\ \mbox{rank-2 update} \end{array} \end{array} \end{equation*}

This formulation has two advantages: it requires fewer computations and it does not generate an intermediate result that is not Hermitian. An algorithm that implements all these insights is given in Figure 10.3.1.2.

\begin{equation*} \newcommand{\routinename}{ \left[ A, t \right] := \mbox{TriRed-unb}( A,t ) } \newcommand{\guard}{ m( A_{TL} ) \lt m( A )-2 } \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}}, t \rightarrow \FlaTwoByOne{ t_{T} }{t_B} } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} 0 \times 0 \mbox{ and } t_T \mbox{ has } 0 \mbox{ elements} } \newcommand{\repartitionings}{ \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}} \rightarrow \FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}} {a_{10}^T}{\alpha_{11}}{a_{12}^T} {A_{20}}{a_{21}}{A_{22}}, \FlaTwoByOne{t_T}{t_B} \rightarrow \FlaThreeByOneB{t_0}{\tau_1}{t_2} } \newcommand{\moveboundaries}{ \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}} \leftarrow \FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}} {a_{10}^T}{\alpha_{11}}{a_{12}^T} {A_{20}}{a_{21}}{A_{22}}, \FlaTwoByOne{t_T}{t_B} \leftarrow \FlaThreeByOneT{t_0}{\tau_1}{t_2} } \newcommand{\update}{ \begin{array}{l} [ a_{21}, \tau_1 ] := {\rm Housev1}( a_{21} ) \\ u_{21} = a_{21} \mbox{ with first element replaced with 1 } \\ \mbox{ Update } A_{22} := H( u_{21} ) A_{22} H( u_{21} ) \mbox{ via the steps } \\ ~~~~~~ \left\{ \begin{array}{ll} y_{21} := A_{22} u_{21} \amp \mbox{(Hermitian matrix-vector multiply!)}\\ \beta := u_{21}^H y_{21} / 2 \\ w_{21} := ( y_{21} - \beta u_{21} / \tau_1 ) / \tau_1 \\ A_{22} := A_{22} - \mbox{tril}(u_{21} w_{21}^H + w_{21} u_{21}^H) \amp \mbox{(Hermitian~rank-2~update)} \end{array} \right. \end{array} } \FlaAlgorithm \end{equation*}

Figure 10.3.1.2. Basic algorithm for reduction of a Hermitian matrix to tridiagonal form.

During the first iteration, when updating \((m-1) \times (m-1) \) matrix \(A_{22} \text{,}\) the bulk of computation is in the computation of \(y_{21} := A_{22} u_{21} \text{,}\) at \(2 (m-1)^2 \) flops, and \(A_{22} := A_{22} - ( u_{21} w_{21}^H + w_{21} u_{21}^H) \text{,}\) at \(2 (m-1)^2 \) flops. The total cost for reducing \(m \times m \) matrix \(A \) to tridiagonal form is therefore approximately

\begin{equation*} \sum_{k=0}^{m-1} 4 (m-k-1)^2 \mbox{ flops}. \end{equation*}

By substituting \(j = m-k-1 \) we find that

\begin{equation*} \sum_{k=0}^{m-1} 4 (m-k-1)^2 \mbox{ flops} = 4 \sum_{j=0}^{m-1} j^2 \mbox{ flops} \approx 4 \int_{0}^m x^2 dx = \frac{4}{3} m^3 \mbox{ flops}. \end{equation*}

This equals, approximately, the cost of one QR factorization of matrix \(A \text{.}\)

Homework 10.3.1.1.

A more straight forward way of updating \(A_{22} \) is given by

\begin{equation*} \begin{array}{rcl} A_{22} \amp := \amp ( I - \frac{1}{\tau} u_{21} u_{21}^H ) A_{22} ( I - \frac{1}{\tau} u_{21} u_{21}^H ) \\ \amp = \amp \begin{array}[t]{c} \underbrace{ ( A_{22} - \frac{1}{\tau} u_{21} \begin{array}[t]{c} \underbrace{ u_{21}^H A_{22}} \\ y_{21}^H \end{array} ) } \\ B_{22} \end{array} ( I - \frac{1}{\tau} u_{21} u_{21}^H ) \\ \amp = \amp B_{22} - \frac{1}{\tau} \begin{array}[t]{c} \underbrace{ B_{22} u_{21} } \\ x_{21} \end{array} u_{21}^H . \end{array} \end{equation*}

This suggests the steps

Compute \(y_{21} = A_{22} u_{21} \text{.}\) (Hermitian matrix-vector multiplication).
Compute \(B_{22} = A_{22}- \frac{1}{\tau} u_{21} y_{21}^H \text{.}\) (Rank-1 update yielding a nonHermitian intermediate matrix).
Compute \(x_{21} = B_{22} u_{21} \text{.}\) (Matrix-vector multiplication).
Compute \(A_{22} = B_{22}- \frac{1}{\tau} x_{21} u_{21}^H \text{.}\) (Rank-1 update yielding a Hermitian final matrix).

Estimate the cost of this alternative approach. What other disadvantage(s) does this approach have?

Solution

During the \(k \)th iteration, for \(k = 0, 1, \ldots, m-1 \) the costs for the various steps are as follows:

Compute \(y_{21} = A_{22} u_{21} \text{.}\) (Hermitian matrix-vector multiplication). Cost: approximately \(2 (m-1)^2 \) flops.
Compute \(B_{22} = A_{22}- \frac{1}{\tau} u_{21} y_{21}^H \text{.}\) (Rank-1 update yielding a nonHermitian intermediate matrix). Cost: approximately \(2 (m-1)^2 \) flops since the intermediate matrix \(B_{22} \) is not Hermitian.
Compute \(x_{21} = B_{22} u_{21} \text{.}\) (Matrix-vector multiplication). Cost: approximately \(2 (m-1)^2 \) flops.
Compute \(A_{22} = B_{22}- \frac{1}{\tau} x_{21} y_{21}^H \text{.}\) Only the lower triangular part of \(A_{22} \) needs to be computed. Cost: approximately \((m-1)^2 \) flops.

Thus, the total cost per iteration is, approximately

\begin{equation*} 7 (m-1)^2 \mbox{ flops}. \end{equation*}

The total cost is then, approximately,

\begin{equation*} \sum_{k=0}^{m-1} 7 (m-k-1)^2 \mbox{ flops} = 7 \sum_{j=0}^{m-1} j^2 \mbox{ flops} \approx 7 \int_{0}^m x^2 dx = \frac{7}{3} m^3 \mbox{ flops}. \end{equation*}

This almost doubles the cost of the reduction to tridiagonal form.

An additional disadvantage is that a nonsquare intermediate matrix must be stored.

The diagonal elements of a Hermitian matrix are real. Hence the tridiagonal matrix has real values on its diagonal. A post process (that follows the reduction to tridiagonal form) can be used to convert the elements of the subdiagonal to real values as well. The advantage of this is that in the subsequent computation, that computes the eigenvalues of the tridiagonal matrix and accumulates the eigenvectors, only needs to perform real (floating point) arithmetic.

Ponder This 10.3.1.2.

Propose a postprocess that converts the off-diagonal elements of a tridiagonal Hermitian matrix to real values. The postprocess must be equivalent to applying a unitary similarity transformation so that eigenvalues are preserved.

n You may want to start by looking at

\begin{equation*} A = \left( \begin{array}{c c} \alpha_{0,0} \amp \overline \alpha_{1,0} \\ \alpha_{1,0} \amp \alpha_{1,1} \end{array} \right), \end{equation*}

where the diagonal elements are real-valued and the off-diagonal elements are complex-valued. Then move on to

\begin{equation*} A = \left( \begin{array}{c c c} \alpha_{0,0} \amp \overline \alpha_{1,0} \amp 0\\ \alpha_{1,0} \amp \alpha_{1,1} \amp \overline \alpha_{2,1} \\ 0 \amp \alpha_{2,1} \amp \alpha_{2,2} \\ \end{array} \right). \end{equation*}

What is the pattern?

Homework 10.3.1.3.

You may want to start by executing git pull to update your directory Assignments.

In directory Assignments/Week10/matlab/, you will find the following files:

Housev1.m: An implementation of the function Housev1, mentioned in the unit.
TriRed.m: A code skeleton for a function that reduces a Hermitian matrix to a tridiagonal matrix. Only the lower triangular part of the input and output are stored.
```
[ T, t ] = TriRed( A, t )
```
returns the diagonal and first subdiagonal of the tridiagonal matrix in T, stores the Householder vectors below the first subdiagonal, and returns the scalars \(\tau \) in vector t.
TriFromBi.m: A function that takes the diagonal and first subdiagonal in the input matrix and returns the tridiagonal matrix that they define.
```
T = TriFromBi( A )
```
test_TriRed.m: A script that tests TriRed.

With these resources, you are to complete TriRed by implementing the algorithm in Figure 10.3.1.2.

Be sure to look at the hint!

Hint

If A holds Hermitian matrix \(A \text{,}\) storing only the lower triangular part, then \(A x \) is implemented in Matlab as

( tril( A ) + tril( A, -1)' ) * x;

Updating only the lower triangular part of array A with \(A := A - B \) is accomplished by

A = A - tril( B );

Solution

Assignments/Week10/answers/TriRed.m.