## Subsection3.2.4Modified Gram-Schmidt (MGS)

In the video, we reasoned that the following two algorithms compute the same values, except that the columns of $Q$ overwrite the corresponding columns of $A \text{:}$

$\begin{array}{l} {\bf for~} j=0, \ldots, n-1 \\ ~~~ a_j^\perp := a_j \\ ~~~ {\bf for~} k=0, \dots, j-1 \\ ~~~ ~~~ \rho_{k,j} := q_k^H a_j^\perp \\ ~~~ ~~~ a_j^\perp := a_j^\perp - \rho_{k,j} q_k \\ ~~~ {\bf end} \\ ~~~ \rho_{j,j} := \| {a}_j^\perp \|_2 \\ ~~~ q_j := a_j^\perp / \rho_{j,j} \\ {\bf end} \end{array}$

$~$

$\begin{array}{l} {\bf for~} j=0, \ldots, n-1 \\ ~~\\ ~~~ {\bf for~} k=0, \dots, j-1 \\ ~~~ ~~~ \rho_{k,j} := a_k^H a_j \\ ~~~ ~~~ a_j := a_j - \rho_{k,j} a_k \\ ~~~ {\bf end} \\ ~~~ \rho_{j,j} := \| {a}_j \|_2 \\ ~~~ a_j := a_j / \rho_{j,j} \\ {\bf end} \end{array}$

(a) MGS algorithm that computes $Q$ and $R$ from $A \text{.}$

$~$

(b) MGS algorithm that computes $Q$ and $R$ from $A \text{,}$ overwriting $A$ with $Q\text{.}$

###### Homework3.2.4.1.

Assume that $q_0, \ldots, q_{k-1}$ are mutually orthonormal. Let $\rho_{j,k} = q_j^H y$ for $j = 0, \ldots, i-1 \text{.}$ Show that

\begin{equation*} \begin{array}[t]{c} \underbrace{ q_i^H y } \\ \rho_{i,k} \end{array} = q_i^H \left( y - \rho_{0,k} q_0 - \cdots - \rho_{i-1,k} q_{i-1} \right) \end{equation*}

for $i = 0, \ldots, k-1 \text{.}$

Solution
\begin{equation*} \begin{array}{l} q_i^H \left( y - \rho_{0,k} q_0 - \cdots - \rho_{i-1,k} q_{i-1} \right) \\ ~~~=~~~~ \lt \mbox{ distribute } \gt \\ q_i^H y - q_i^H \rho_{0,k} q_0 - \cdots - q_i^H \rho_{i-1,k} q_{i-1} \\ ~~~=~~~~ \lt \rho_{0,k} \mbox{ is a scalar } \gt \\ q_i^H y - \rho_{0,k} \begin{array}[t]{c} \underbrace{ q_i^H q_0} \\ 0 \end{array} - \cdots - \begin{array}[t]{c} \underbrace{ \rho_{i-1,k} q_i^H q_{i-1} } \\ 0 \end{array} \end{array} \end{equation*}

This homework illustrates how, given a vector $y \in \Cm$ and a matrix $Q \in \C^{mxk}$ the component orthogonal to the column space of $Q \text{,}$ given by $( I - Q Q^H ) y \text{,}$ can be computed by either of the two algorithms given in Figure 3.2.4.1. The one on the left, $\mbox{Proj}\perp Q_{\rm CGS}( Q, y )$ projects $y$ onto the column space perpendicular to $Q$ as did the Gram-Schmidt algorithm with which we started. The one on the left successfuly subtracts out the component in the direction of $q_i$ using a vector that has been updated in previous iterations (and hence is already orthogonal to $q_0, \ldots, q_{i-1} \text{.}$ The algorithm on the right is one variant of the Modified Gram-Schmidt (MGS) algorithm.

These insights allow us to present CGS and this variant of MGS in FLAME notation, in Figure 3.2.4.2 (left and middle).

Next, we massage the MGS algorithm into the alternative MGS algorithmic variant given in Figure 3.2.4.2 (right).

The video discusses how MGS can be rearranged so that every time a new vector $q_k$ is computed (overwriting $a_k$), the remaining vectors, $\{ a_{k+1}, \ldots, a_{n-1} \} \text{,}$ can be updated by subtracting out the component in the direction of $q_k \text{.}$ This is also illustrated through the next sequence of equivalent algorithms.

This discussion shows that the updating of future columns by subtracting out the component in the direction of the latest column of $Q$ to be computed can be cast in terms of a rank-1 update. This is also captured, using FLAME notation, in the algorithm in Figure 3.2.4.2, as is further illustrated in Figure 3.2.4.4: Figure 3.2.4.4. Alternative Modified Gram-Schmidt algorithm for computing the QR factorization of a matrix $A \text{.}$

###### Ponder This3.2.4.2.

Let $A$ have linearly independent columns and let $A = Q R$ be a QR factorization of $A \text{.}$ Partition

\begin{equation*} A \rightarrow \FlaOneByTwo{A_L}{A_R}, \quad Q \rightarrow \FlaOneByTwo{Q_L}{Q_R}, {\rm ~~~and~~~} R \rightarrow \FlaTwoByTwo {R_{TL}}{R_{TR}} { 0 }{R_{BR}}, \end{equation*}

where $A_L$ and $Q_L$ have $k$ columns and $R_{TL}$ is $k \times k \text{.}$

As you prove the following insights, relate each to the algorithm in Figure 3.2.4.4. In particular, at the top of the loop of a typical iteration, how have the different parts of $A$ and $R$ been updated?

1. $A_L = Q_L R_{TL}\text{.}$

($Q_L R_{TL}$ equals the QR factorization of $A_L \text{.}$)

2. ${\cal C}( A_L ) = {\cal C}( Q_L ) \text{.}$

(The first $k$ columns of $Q$ form an orthonormal basis for the space spanned by the first $k$ columns of $A \text{.}$)

3. $R_{TR} = Q_L^H A_R\text{.}$

4. $(A_R - Q_L R_{TR} )^H Q_L = 0 \text{.}$

(Each column in $A_R - Q_L R_{TR}$ equals the component of the corresponding column of $A_R$ that is orthogonal to $\Span( Q_L ) \text{.}$)

5. ${\cal C}( A_R - Q_L R_{TR} ) = {\cal C}( Q_R ) \text{.}$

6. $A_R - Q_L R_{TR} = Q_R R_{BR} \text{.}$

(The columns of $Q_R$ form an orthonormal basis for the column space of $A_R - Q_L R_{TR} \text{.}$)

Solution

Consider the fact that $A = Q R \text{.}$ Then, multiplying the partitioned matrices,

\begin{equation*} \begin{array}{rcl} \FlaOneByTwo{A_L}{A_R} \amp = \amp \FlaOneByTwo{Q_L}{Q_R} \FlaTwoByTwo {R_{TL}}{R_{TR}} { 0 }{R_{BR}} \\ \amp = \amp \FlaOneByTwo{Q_L R_{TL}}{Q_L R_{TR} + Q_R R_{BR}}. \end{array} \end{equation*}

Hence

\begin{equation} A_L = Q_L R_{TL} \quad \mbox{and} \quad A_R = Q_L R_{TR} + Q_R R_{BR}.\label{MGS-eqn-1}\tag{3.2.2} \end{equation}
1. The left equality in (3.2.2) answers 1.

2. ${\cal C}( A_L ) = {\cal C}( Q_L )$ can be shown by noting that $R$ is upper triangular and nonsingular and hence $R_{TL}$ is upper triangular and nonsingular, and using this to show that ${\cal C}( A_L ) \subset {\cal C}( Q_L )$ and ${\cal C}( Q_L ) \subset {\cal C}( A_L ) \text{:}$

• ${\cal C}( A_L ) \subset {\cal C}( Q_L ) \text{:}$ Let $y \in {\cal C}( A_L ) \text{.}$ Then there exists $x$ such that $A_L x = y \text{.}$ But then $Q_L R_{TL} x = y$ and hence $Q_L( R_{TL} x ) = y$ which means that $y \in {\cal C}( Q_L ) \text{.}$

• ${\cal C}( Q_L ) \subset {\cal C}( A_L ) \text{:}$ Let $y \in {\cal C}( Q_L ) \text{.}$ Then there exists $x$ such that $Q_L x = y \text{.}$ But then $A_L R_{TL}^{-1} x = y$ and hence $A_L( R_{TL}^{-1} x ) = y$ which means that $y \in {\cal C}( A_L ) \text{.}$

3. Take $A_R - Q_L R_{TR} = Q_R R_{BR}$ and multiply both side by $Q_L^H \text{:}$

\begin{equation*} Q_L^H( A_R - Q_L R_{TR} )= Q_L^H Q_R R_{BR} \end{equation*}

is equivalent to

\begin{equation*} Q_L^H A_R - \begin{array}[t]{c} \underbrace{ Q_L^H Q_L } \\ I \end{array} R_{TR} = \begin{array}[t]{c} \underbrace{ Q_L^H Q_R } \\ 0 \end{array} R_{BR} = 0. \end{equation*}

Rearranging yields 3.

4. Since $A_R - Q_L R_{TR} = Q_R R_{BR}$ we find that $( A_R - Q_L R_{TR} )^H Q_L = ( Q_R R_{BR} )^H Q_L$ and

\begin{equation*} ( A_R - Q_L R_{TR} )^H Q_L = R_{BR}^H Q_R^H Q_L = 0. \end{equation*}
5. Similar to the proof of 2.

6. Rearranging the right equality in (3.2.2) yields $A_R - Q_L R_{TR} = Q_R R_{BR}. \text{,}$ which answers 5.

7. Letting $\widehat A$ denote the original contents of $A \text{,}$ at a typical point,

• $A_L$ has been updated with $Q_L \text{.}$

• $R_{TL}$ and $R_{TR}$ have been computed.

• $A_R = \widehat A_R - Q_L R_{TR} \text{.}$

###### Homework3.2.4.3.

Implement the algorithm in Figure 3.2.4.4 as

function [ Aout, Rout ] = MGS_QR( A, R )

Input is an $m \times n$ matrix A and a $n \times n$ matrix $R \text{.}$ Output is the matrix $Q \text{,}$ which has overwritten matrix A, and the upper triangular matrix $R \text{.}$ (The values below the diagonal can be arbitrary.) You may want to use Assignments/Week03/matlab/test_MGS_QR.m to check your implementation.