ALAFF Classical Gram-Schmidt (CGS)

Subsection 3.2.1 Classical Gram-Schmidt (CGS)

Given a set of linearly independent vectors \(\{ a_0, \ldots, a_{n-1} \} \subset \Cm \text{,}\) the Gram-Schmidt process computes an orthonormal basis \(\{ q_0, \ldots, q_{n-1} \} \) that spans the same subspace as the original vectors, i.e.

\begin{equation*} \Span ( \{ a_0, \ldots, a_{n-1} \} ) = \Span ( \{ q_0, \ldots, q_{n-1} \} ) . \end{equation*}

The process proceeds as follows:

Compute vector \(q_0 \) of unit length so that \(\Span( \{ a_0 \} ) = \Span( \{ q_0 \} ) \text{:}\)
- \(\rho_{0,0} = \| a_0 \|_2 \)
  
  Computes the length of vector \(a_0 \text{.}\)
- \(q_0 = a_0 / \rho_{0,0} \)
  
  Sets \(q_0 \) to a unit vector in the direction of \(a_0 \text{.}\)
Notice that \(a_0 = q_0 \rho_{0,0} \)
Compute vector \(q_1 \) of unit length so that \(\Span( \{ a_0, a_1 \} ) = \Span( \{ q_0, q_1 \} ) \text{:}\)
- \(\rho_{0,1} = q_0^H a_1 \)
  
  Computes \(\rho_{0,1} \) so that \(\rho_{0,1} q_0 = q_0^H a_1 q_0 \) equals the component of \(a_1 \) in the direction of \(q_0 \text{.}\)
- \(a_1^\perp = a_1 - \rho_{0,1} q_0 \)
  
  Computes the component of \(a_1 \) that is orthogonal to \(q_0 \text{.}\)
- \(\rho_{1,1} = \| a_1^\perp \|_2 \)
  
  Computes the length of vector \(a_1^\perp \text{.}\)
- \(q_1 = a_1^\perp / \rho_{1,1} \)
  
  Sets \(q_1 \) to a unit vector in the direction of \(a_1^\perp \text{.}\)
Notice that

\begin{equation*} \left( \begin{array}{c | c} a_0 \amp a_1 \end{array} \right) = \left( \begin{array}{c | c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c | c} \rho_{0,0} \amp \rho_{0,1} \\ \hline 0 \amp \rho_{1,1} \end{array} \right) . \end{equation*}
Compute vector \(q_2 \) of unit length so that \(\Span( \{ a_0, a_1, a_2 \} ) = \Span( \{ q_0, q_1, q_2 \} ) \text{:}\)
- \(\begin{array}{l} \rho_{0,2} = q_0^H a_2 \\ \rho_{1,2} = q_1^H a_2 \end{array} \mbox{ or, equivalently, } \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) = \left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right)^H a_2\)
  
  Computes \(\rho_{0,2} \) so that \(\rho_{0,2} q_0 = q_0^H a_2 q_0 \) and \(\rho_{1,2} q_1 = q_1^H a_2 q_1 \) equal the components of \(a_2 \) in the directions of \(q_0 \) and \(q_1 \text{.}\)
  
  Or, equivalently, \(\left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) \) is the component in \(\Span( \{ q_0, q_1 \} ) \text{.}\)
- \(a_2^\perp = a_2 - \rho_{0,2} q_0 - \rho_{1,2} q_1 = a_2 - \left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) \)
  
  Computes the component of \(a_2 \) that is orthogonal to \(q_0 \) and \(q_1 \text{.}\)
- \(\rho_{2,2} = \| a_2^\perp \|_2 \)
  
  Computes the length of vector \(a_2^\perp \text{.}\)
- \(q_2 = a_2^\perp / \rho_{2,2} \)
  
  Sets \(q_2 \) to a unit vector in the direction of \(a_2^\perp \text{.}\)
Notice that

\begin{equation*} \left( \begin{array}{c c | c} a_0 \amp a_1 \amp a_2 \end{array} \right) = \left( \begin{array}{c c | c} q_0 \amp q_1 \amp q_2 \end{array} \right) \left( \begin{array}{c c | c} \rho_{0,0} \amp \rho_{0,1} \amp \rho_{0,2} \\ 0 \amp \rho_{1,1} \amp \rho_{1,2} \\ \hline 0 \amp 0 \amp \rho_{2,2} \end{array} \right) . \end{equation*}
And so forth.

Yet another way of looking at this problem is as follows.

Consider the matrices

\begin{equation*} A = \left( \begin{array}{c | c | c | c | c | c | c} a_0 \amp \cdots \amp a_{k-1} \amp a_k \amp a_{k+1} \amp \cdots \amp a_{n-1} \end{array} \right) \end{equation*}

and

\begin{equation*} Q = \left( \begin{array}{c | c | c | c | c | c | c} q_0 \amp \cdots \amp q_{k-1} \amp q_k \amp q_{k+1} \amp \cdots \amp q_{n-1} \end{array} \right) \end{equation*}

We observe that

\(\Span( \{a_0\} ) = \Span( \{ q_0 \} ) \)

Hence \(a_0 = \rho_{0,0} q_0 \) for some scalar \(\rho_{0,0} \text{.}\)
\(\Span( \{a_0, a_1\} ) = \Span( \{ q_0, q_1 \} ) \)

Hence

\begin{equation*} a_1 = \rho_{0,1} q_0 + \rho_{1,1} q_1 \end{equation*}

for some scalars \(\rho_{0,1}, \rho_{1,1} \text{.}\)
In general, \(\Span( \{a_0, \ldots, a_{k-1}, a_{k}\} ) = \Span( \{ q_0, \ldots, q_{k-1}, q_k \} ) \)

Hence

\begin{equation*} a_k = \rho_{0,k} q_0 + \cdots + \rho_{k-1,k} q_{k-1} + \rho_{k,k} q_k \end{equation*}

for some scalars \(\rho_{0,k}, \cdots, \rho_{k,k} \text{.}\)

Let's assume that \(q_0, \ldots, q_{k-1} \) have already been computed and are mutually orthonormal. Consider

\begin{equation*} a_k = \rho_{0,k} q_0 + \cdots + \rho_{k-1,k} q_{k-1} + \rho_{k,k} q_k. \end{equation*}

Notice that

\begin{equation*} \begin{array}{rcl} q_k^H a_k \amp = \amp q_k^H \left( \rho_{0,k} q_0 + \cdots + \rho_{k-1,k} q_{k-1} + \rho_{k,k} q_k \right) \\ \amp = \amp \rho_{0,k} \begin{array}[t]{c} \underbrace{ q_k^H q_0 } \\ 0 \end{array} + \cdots + \rho_{k-1,k} \begin{array}[t]{c} \underbrace{ q_k^H q_{k-1} } \\ 0 \end{array} + \rho_{k,k} \begin{array}[t]{c} \underbrace{ q_k^H q_k } \\ 1 \end{array} \end{array} \end{equation*}

so that

\begin{equation*} \rho_{i,k} = q_i^H a_k, \end{equation*}

for \(i = 0, \ldots , k-1 \text{.}\) Next, we can compute

\begin{equation*} a_k^\perp = a_k - \rho_{0,k} q_0 - \cdots - \rho_{k-1,k} q_{k-1} \end{equation*}

and, since \(\rho_{k,k} q_k = a_k^\perp\text{,}\) we can choose

\begin{equation*} \rho_{k,k} = \| a_k^\perp \|_2 \end{equation*}

and

\begin{equation*} q_k = a_k^\perp / \rho_{k,k} \end{equation*}

Remark 3.2.1.1.

For a review of Gram-Schmidt orthogonalization and exercises orthogonalizing real-valued vectors, you may want to look at Linear Algebra: Foundations to Frontiers (LAFF) [27] Week 11.