
Subsection 7.2.1 Banded matrices

It is tempting to simply use a dense linear solver to compute the solution to \(A x = b \) via, for example, LU or Cholesky factorization, even when \(A \) is sparse. This would require \(O( n^3 ) \) operations, where \(n \) is the size of the matrix \(A \text{.}\) In this unit we see that we can take advantage of a "banded" structure in the matrix to greatly reduce the computational cost.

Homework 7.2.1.1.

The 1D equivalent of the example from Subsection 7.1.1 is a linear system \(A x = b \) with the tridiagonal matrix

\begin{equation} A = \left( \begin{array}{r r r r r r} 2 \amp -1 \amp \amp \amp \\ -1 \amp 2 \amp -1 \amp \amp \\ \amp \ddots \amp \ddots \amp \ddots \amp \\ \amp \amp -1 \amp 2 \amp -1 \\ \amp \amp \amp -1 \amp 2 \\ \end{array} \right). \label{chapter07-banded-eq1}\tag{7.2.1} \end{equation}

Prove that this matrix is nonsingular.

Hint

Consider \(A x = 0 \text{.}\) We need to prove that \(x = 0 \text{.}\) If you instead consider the equivalent problem

\begin{equation*} \left( \begin{array}{c | c | c} 1 \amp 0 \amp 0 \\ \hline \begin{array}{c} -1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{array} \amp \left( \begin{array}{c c c c c c c} 2 \amp -1 \amp \amp \cdots \amp 0 \\ -1 \amp 2 \amp -1 \amp \cdots \amp 0 \\ \amp \ddots \amp \ddots \amp \ddots \amp \vdots \\ \amp \amp -1 \amp 2 \amp -1 \\ \amp \amp \amp -1 \amp 2 \\ \end{array} \right) \amp \begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \\ -1 \end{array} \\ \hline 0 \amp 0 \amp 1 \end{array} \right) \left( \begin{array}{c} \chi_{-1} \\ \hline \begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-2} \\ \chi_{n-1} \end{array} \\ \hline \chi_n \end{array} \right) = \left( \begin{array}{c} 0 \\ \hline \begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{array} \\ \hline 0 \end{array} \right) \end{equation*}

that introduces two extra variables \(\chi_{-1} = 0 \) and \(\chi_n = 0 \text{,}\) the problem for all \(\chi_i \text{,}\) \(0 \leq i \lt n \text{,}\) becomes

\begin{equation*} - \chi_{i-1} + 2 \chi_i - \chi_{i+1} = 0, \end{equation*}

or, equivalently,

\begin{equation*} \chi_i = \frac{\chi_{i-1} + \chi_{i+1}}{2}. \end{equation*}

Reason through what would happen if any \(\chi_i \) is not equal to zero.

Solution

Building on the hint: Suppose that \(\chi_i \neq 0 \) while \(\chi_{-1}, \ldots , \chi_{i-1} \) all equal zero. Without loss of generality, assume \(\chi_i \gt 0 \) (if not, consider \(-x \text{,}\) which also satisfies \(A x = 0 \)). Then

\begin{equation*} \chi_{i} = \frac{\chi_{i-1} + \chi_{i+1}}{2} = \frac{1}{2} \chi_{i+1} \end{equation*}

and hence

\begin{equation*} \chi_{i+1} = 2 \chi_i \gt 0. \end{equation*}

Next,

\begin{equation*} \chi_{i+1} = \frac{\chi_{i} + \chi_{i+2}}{2} = 2 \chi_{i} \end{equation*}

and hence

\begin{equation*} \chi_{i+2} = 4 \chi_i - \chi_i = 3 \chi_i \gt 0. \end{equation*}

Continuing this argument, we find that \(\chi_{i+k} = ( k+1 ) \chi_i \) for \(k = 0, 1, \ldots, n-i \text{,}\) and in particular \(\chi_n = ( n-i + 1) \chi_i \gt 0 \text{,}\) which contradicts \(\chi_n = 0 \text{.}\) Hence all \(\chi_i \) equal zero, so that \(x = 0 \) and the matrix is nonsingular.
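For completeness (a step not spelled out above), the pattern can be verified by induction: if \(\chi_{i+k-1} = k \chi_i \) and \(\chi_{i+k} = (k+1) \chi_i \text{,}\) then the recurrence \(- \chi_{i+k-1} + 2 \chi_{i+k} - \chi_{i+k+1} = 0 \) gives

\begin{equation*} \chi_{i+k+1} = 2 \chi_{i+k} - \chi_{i+k-1} = 2 ( k+1 ) \chi_i - k \chi_i = ( k+2 ) \chi_i . \end{equation*}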

This course covers topics in a "circular" way, where sometimes we introduce and use results that we won't formally cover until later in the course. Here is one such situation. In a later week you will prove these relevant results involving eigenvalues:

  • A symmetric matrix is symmetric positive definite (SPD) if and only if its eigenvalues are positive.

  • The Gershgorin Disk Theorem tells us that the matrix in (7.2.1) has nonnegative eigenvalues.

  • A matrix is singular if and only if it has zero as an eigenvalue.

These insights, together with Homework 7.2.1.1, tell us that the matrix in (7.2.1) is SPD.
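To spell out the Gershgorin step: every row of the matrix in (7.2.1) has diagonal entry \(2 \) and off-diagonal entries whose absolute values sum to at most \(2 \text{,}\) so every eigenvalue \(\lambda \) (which is real, since the matrix is symmetric) satisfies \(\vert \lambda - 2 \vert \leq 2 \text{,}\) i.e., \(0 \leq \lambda \leq 4 \text{.}\) Nonsingularity rules out \(\lambda = 0 \text{,}\) so all eigenvalues are positive.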

Homework 7.2.1.2.

Compute the Cholesky factor of

\begin{equation*} A = \left( \begin{array}{r r r r} 4 \amp -2 \amp 0 \amp 0 \\ -2 \amp 5 \amp -2 \amp 0 \\ 0 \amp -2 \amp 10 \amp 6 \\ 0 \amp 0 \amp 6 \amp 5 \\ \end{array} \right). \end{equation*}
Answer
\begin{equation*} L = \left( \begin{array}{r r r r} 2 \amp 0 \amp 0 \amp 0\\ -1 \amp 2 \amp 0 \amp 0 \\ 0 \amp -1 \amp 3 \amp 0 \\ 0 \amp 0 \amp 2 \amp 1 \end{array} \right). \end{equation*}
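As a quick numerical sanity check (not part of the original exercise), the answer can be verified with NumPy:

    import numpy as np

    A = np.array([[ 4., -2.,  0.,  0.],
                  [-2.,  5., -2.,  0.],
                  [ 0., -2., 10.,  6.],
                  [ 0.,  0.,  6.,  5.]])
    L = np.linalg.cholesky(A)          # lower triangular factor with A = L L^T
    print(L)                           # matches the answer above
    print(np.allclose(L @ L.T, A))     # True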
Homework 7.2.1.3.

Let \(A \in \mathbb R^{n \times n} \) be tridiagonal and SPD so that

\begin{equation} A = \left( \begin{array}{c c c c c c c} \alpha_{0,0} \amp \alpha_{1,0} \amp \amp \amp \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \alpha_{2,1} \amp \amp \\ \amp \ddots \amp \ddots \amp \ddots \amp \\ \amp \amp \alpha_{n-2,n-3} \amp \alpha_{n-2,n-2} \amp \alpha_{n-1,n-2} \\ \amp \amp \amp \alpha_{n-1,n-2} \amp \alpha_{n-1,n-1} \end{array} \right). \label{chapter07-banded-eq2}\tag{7.2.2} \end{equation}
  • Propose a Cholesky factorization algorithm that exploits the structure of this matrix.

  • What is the cost? (Count square roots, divides, multiplies, and subtractions.)

  • What would have been the (approximate) cost if we had not taken advantage of the tridiagonal structure?

Solution
  • If you play with a few smaller examples, you can conjecture that the Cholesky factor of (7.2.2) is a bidiagonal matrix (the main diagonal plus the first subdiagonal). Thus, \(A = L L^T \) translates to

    \begin{equation*} \begin{array}{l} \left( \begin{array}{c c c c c c c} \alpha_{0,0} \amp \alpha_{1,0} \amp \amp \amp \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \alpha_{2,1} \amp \amp \\ \amp \ddots \amp \ddots \amp \ddots \amp \\ \amp \amp \alpha_{n-2,n-3} \amp \alpha_{n-2,n-2} \amp \alpha_{n-1,n-2} \\ \amp \amp \amp \alpha_{n-1,n-2} \amp \alpha_{n-1,n-1} \end{array} \right) \\ = \left( \begin{array}{c c c c c c c} \lambda_{0,0} \amp \amp \amp \amp \\ \lambda_{1,0} \amp \lambda_{1,1} \amp \amp \amp \\ \amp \ddots \amp \ddots \amp \amp \\ \amp \amp \lambda_{n-2,n-3} \amp \lambda_{n-2,n-2} \amp \\ \amp \amp \amp \lambda_{n-1,n-2} \amp \lambda_{n-1,n-1} \end{array} \right) \left( \begin{array}{c c c c c c c} \lambda_{0,0} \amp \lambda_{1,0} \amp \amp \amp \\ \amp \lambda_{1,1} \amp \lambda_{2,1} \amp \amp \\ \amp \amp \ddots \amp \ddots \amp \\ \amp \amp \amp \lambda_{n-2,n-2} \amp \lambda_{n-1,n-2} \\ \amp \amp \amp \amp \lambda_{n-1,n-1} \end{array} \right) \\ = \left( \begin{array}{c c c c c c c} \lambda_{0,0} \lambda_{0,0} \amp \lambda_{0,0} \lambda_{1,0} \amp \amp \amp \\ \lambda_{1,0} \lambda_{0,0} \amp \lambda_{1,0} \lambda_{1,0} + \lambda_{1,1} \lambda_{1,1} \amp \lambda_{1,1} \lambda_{2,1} \amp \amp \\ \amp \lambda_{2,1} \lambda_{1,1} \amp \ddots \amp \ddots \amp \\ \amp \amp \ddots \amp \star \star \amp \lambda_{n-2,n-2} \lambda_{n-1,n-2} \\ \amp \amp \amp \lambda_{n-1,n-2} \lambda_{n-2,n-2} \amp \begin{array}[t]{l} \lambda_{n-1,n-2} \lambda_{n-1,n-2} \\ ~~~~~~~+ \lambda_{n-1,n-1} \lambda_{n-1,n-1} \end{array} \end{array} \right) , \end{array} \end{equation*}

    where \(\star\star = \lambda_{n-2,n-3} \lambda_{n-2,n-3} + \lambda_{n-2,n-2} \lambda_{n-2,n-2} \text{.}\) With this insight, the algorithm that overwrites \(A \) with its Cholesky factor is given by

    \begin{equation*} \begin{array}{l} {\bf for~} i = 0, \ldots, n-2 \\ ~~~ \alpha_{i,i} := \sqrt{ \alpha_{i,i} } \\ ~~~ \alpha_{i+1,i} := \alpha_{i+1,i} / \alpha_{i,i} \\ ~~~ \alpha_{i+1,i+1} := \alpha_{i+1,i+1} - \alpha_{i+1,i} \alpha_{i+1,i} \\ {\bf endfor} \\ \alpha_{n-1,n-1} := \sqrt{ \alpha_{n-1,n-1} } \\ \end{array} \end{equation*}
  • A cost analysis shows that this requires \(n \) square roots, \(n-1 \) divides, \(n-1 \) multiplies, and \(n-1 \) subtracts. (A runnable sketch of the algorithm above is given after this list.)

  • The cost, had we not taken advantage of the special structure, would have been (approximately) \(\frac{1}{3} n^3 \text{.}\)
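The factorization loop above translates almost line by line into code. Here is a minimal Python sketch (the function name tridiag_cholesky and the arrays d and e, holding the diagonal and subdiagonal, are illustrative choices, not from the text):

    import numpy as np

    def tridiag_cholesky(d, e):
        # Cholesky factorization of an SPD tridiagonal matrix, stored as
        #   d = [alpha_{0,0}, ..., alpha_{n-1,n-1}]   (diagonal)
        #   e = [alpha_{1,0}, ..., alpha_{n-1,n-2}]   (subdiagonal)
        # Returns the diagonal and subdiagonal of the bidiagonal factor L.
        d = d.astype(float)
        e = e.astype(float)
        n = d.shape[0]
        for i in range(n - 1):
            d[i] = np.sqrt(d[i])                 # alpha_{i,i} := sqrt(alpha_{i,i})
            e[i] = e[i] / d[i]                   # alpha_{i+1,i} := alpha_{i+1,i} / alpha_{i,i}
            d[i + 1] = d[i + 1] - e[i] * e[i]    # alpha_{i+1,i+1} := alpha_{i+1,i+1} - alpha_{i+1,i}^2
        d[n - 1] = np.sqrt(d[n - 1])
        return d, e

    # Example: the matrix from Homework 7.2.1.2 (which happens to be tridiagonal)
    d_L, e_L = tridiag_cholesky(np.array([4., 5., 10., 5.]), np.array([-2., -2., 6.]))
    print(d_L, e_L)   # [2. 2. 3. 1.] [-1. -1.  2.]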

Homework 7.2.1.4.

Propose an algorithm for overwriting \(y \) with the solution to \(A x = y \) for the SPD matrix in Homework 7.2.1.3.

Solution
  • Use the algorithm from Homework 7.2.1.3 to overwrite \(A \) with its Cholesky factor.

  • Since \(A = L L^T \text{,}\) we need to solve \(L z = y \) and then \(L^T x = z \text{.}\) (A Python sketch of both solves is given after this solution.)

    • Overwriting \(y \) with the solution of \(L z = y \) (forward substitution) is accomplished by the following algorithm (here \(L \) has overwritten \(A \)):

      \begin{equation*} \begin{array}{l} {\bf for~} i = 0, \ldots, n-2 \\ ~~~ \psi_i := \psi_i / \alpha_{i,i} \\ ~~~ \psi_{i+1} := \psi_{i+1} - \alpha_{i+1,i} \psi_i \\ {\bf endfor} \\ \psi_{n-1} := \psi_{n-1} / \alpha_{n-1,n-1} \end{array} \end{equation*}
    • Overwriting \(y \) with the solution of \(L^T x = z \) (where \(z \) has overwritten \(y \)) via back substitution is accomplished by the following algorithm (here \(L \) has overwritten \(A \)):

      \begin{equation*} \begin{array}{l} {\bf for~} i = n-1, n-2, \ldots, 1 \\ ~~~ \psi_i := \psi_i / \alpha_{i,i} \\ ~~~ \psi_{i-1} := \psi_{i-1} - \alpha_{i,i-1} \psi_i \\ {\bf endfor} \\ \psi_{0} := \psi_{0} / \alpha_{0,0} \end{array} \end{equation*}
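The two sweeps can likewise be written as a short routine. Below is a minimal Python sketch that reuses the (hypothetical) d_L, e_L arrays produced by the tridiag_cholesky sketch given earlier:

    import numpy as np

    def tridiag_solve(d_L, e_L, y):
        # Solve A x = y given the bidiagonal Cholesky factor L of the
        # tridiagonal SPD matrix A = L L^T, with d_L and e_L holding the
        # diagonal and subdiagonal of L (as produced by tridiag_cholesky).
        x = y.astype(float)
        n = d_L.shape[0]
        # Forward substitution: L z = y  (z overwrites x).
        for i in range(n - 1):
            x[i] = x[i] / d_L[i]
            x[i + 1] = x[i + 1] - e_L[i] * x[i]
        x[n - 1] = x[n - 1] / d_L[n - 1]
        # Back substitution: L^T x = z.
        for i in range(n - 1, 0, -1):
            x[i] = x[i] / d_L[i]
            x[i - 1] = x[i - 1] - e_L[i - 1] * x[i]
        x[0] = x[0] / d_L[0]
        return x

    # Example, continuing from the tridiag_cholesky sketch above:
    #   x = tridiag_solve(d_L, e_L, np.array([1., 0., 0., 0.]))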

The last exercises illustrate how special structure (in terms of patterns of zeroes and nonzeroes) can often be exploited to reduce the cost of factoring a matrix and solving a linear system.

The bandwidth of a matrix is defined as the smallest integer \(b \) such that all elements on the \(j \)th superdiagonal and subdiagonal of the matrix equal zero whenever \(j \geq b \text{.}\)

  • A diagonal matrix has bandwidth \(1 \text{.}\)

  • A tridiagonal matrix has bandwidth \(2 \text{.}\)

  • And so forth.

Let's see how to take advantage of the zeroes in a matrix with bandwidth \(b \text{,}\) focusing on SPD matrices.

Definition 7.2.1.1.

The half-band width of a symmetric matrix equals the number of subdiagonals beyond which the matrix contains only zeroes. For example, a diagonal matrix has a half-band width of zero and a tridiagonal matrix has a half-band width of one.
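As a small illustration of this definition, the following Python snippet computes the half-band width of a symmetric matrix by scanning its lower triangular part (the helper name and its tol argument are illustrative, not from the text):

    import numpy as np

    def half_band_width(A, tol=0.0):
        # Largest |i - j| for which A[i, j] is (numerically) nonzero;
        # by symmetry it suffices to scan the lower triangular part.
        n = A.shape[0]
        b = 0
        for i in range(n):
            for j in range(i):
                if abs(A[i, j]) > tol:
                    b = max(b, i - j)
        return b

    # A tridiagonal matrix has half-band width one:
    T = np.diag([2., 2., 2., 2.]) + np.diag([-1., -1., -1.], -1) + np.diag([-1., -1., -1.], 1)
    print(half_band_width(T))   # 1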

Homework 7.2.1.5.

Assume the SPD matrix \(A \in \mathbb R^{m \times m}\) has a bandwidth of \(b \text{.}\) Propose a modification of the right-looking Cholesky factorization from Figure 5.4.3.1

\begin{equation*} \newcommand{\routinename}{A = \mbox{Chol-right-looking}( A )} \newcommand{\guard}{ n( A_{TL} ) \lt n( A ) } \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}} } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} 0 \times 0 } \newcommand{\repartitionings}{ \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}} \rightarrow \FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}} {a_{10}^T}{\alpha_{11}}{a_{12}^T} {A_{20}}{a_{21}}{A_{22}} } \newcommand{\moveboundaries}{ \FlaTwoByTwo{A_{TL}}{A_{TR}} {A_{BL}}{A_{BR}} \leftarrow \FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}} {a_{10}^T}{\alpha_{11}}{a_{12}^T} {A_{20}}{a_{21}}{A_{22}} } \newcommand{\update}{ \begin{array}{l} \alpha_{11} := \sqrt{ \alpha_{11} } \\ a_{21} := a_{21} / \alpha_{11} \\ A_{22} := A_{22} - a_{21} a_{21}^T ~~~~~\mbox{ (updating only the lower triangular part)} \end{array} } \FlaAlgorithm \end{equation*}
that takes advantage of the zeroes in the matrix. (You will want to draw yourself a picture.) What is its approximate cost in flops (when \(m \) is large)?

Solution

See the video below.
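Since the worked solution is given in the video, here is only a hedged sketch of one possible realization in Python: a right-looking Cholesky factorization that restricts each update to the entries within the half-band width, using dense storage for simplicity (a practical implementation would store only the band; this is not necessarily the approach taken in the video):

    import numpy as np

    def banded_cholesky(A, hb):
        # Right-looking Cholesky for an SPD matrix with half-band width hb
        # (hb = number of nonzero subdiagonals, as in Definition 7.2.1.1).
        A = A.astype(float)
        m = A.shape[0]
        for i in range(m):
            last = min(i + hb + 1, m)    # only these rows/columns can be nonzero
            A[i, i] = np.sqrt(A[i, i])                         # alpha11 := sqrt(alpha11)
            A[i + 1:last, i] /= A[i, i]                        # a21 := a21 / alpha11 (at most hb entries)
            A[i + 1:last, i + 1:last] -= np.tril(              # A22 := A22 - a21 a21^T
                np.outer(A[i + 1:last, i], A[i + 1:last, i]))  # (lower triangular part only)
        return np.tril(A)

    # Quick check against the unblocked factorization:
    T = np.diag(2. * np.ones(6)) + np.diag(-np.ones(5), -1) + np.diag(-np.ones(5), 1)
    print(np.allclose(banded_cholesky(T, 1), np.linalg.cholesky(T)))   # True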

Ponder This 7.2.1.6.

Propose a modification of the FLAME notation that allows one to elegantly express the algorithm you proposed for Homework 7.2.1.5.

Ponder This 7.2.1.7.

Another way of looking at an SPD matrix \(A \in \mathbb R^{n \times n} \) with bandwidth \(b \) is to block it

\begin{equation*} A = \left( \begin{array}{c c c c c c c} A_{0,0} \amp A_{1,0}^T \amp \amp \amp \\ A_{1,0} \amp A_{1,1} \amp A_{2,1}^T \amp \amp \\ \amp \ddots \amp \ddots \amp \ddots \amp \\ \amp \amp A_{n-2,n-3} \amp A_{n-2,n-2} \amp A_{n-1,n-2}^T \\ \amp \amp \amp A_{n-1,n-2} \amp A_{n-1,n-1} \end{array} \right). \end{equation*}

where \(A_{i,j} \in \mathbb R^{b \times b} \) and, for simplicity, we assume that \(n \) is a multiple of \(b \text{.}\) Propose an algorithm for computing its Cholesky factorization that exploits this block structure. What special structure do the matrices \(A_{i+1,i} \) have? Can you take advantage of this structure?

Analyze the cost of your proposed algorithm.