Skip to main content

Subsection 3.5.2 Summary

Classical Gram-Schmidt orthogonalization: Given a set of linearly independent vectors \(\{ a_0, \ldots, a_{n-1} \} \subset \Cm \text{,}\) the Gram-Schmidt process computes an orthonormal basis \(\{ q_0, \ldots, q_{n-1} \} \) that spans the same subspace as the original vectors, i.e.

\begin{equation*} \Span ( \{ a_0, \ldots, a_{n-1} \} ) = \Span ( \{ q_0, \ldots, q_{n-1} \} ) . \end{equation*}

The process proceeds as follows:

  • Compute vector \(q_0 \) of unit length so that \(\Span( \{ a_0 \} ) = \Span( \{ q_0 \} ) \text{:}\)

    • \(\rho_{0,0} = \| a_0 \|_2 \)

      Computes the length of vector \(a_0 \text{.}\)

    • \(q_0 = a_0 / \rho_{0,0} \)

      Sets \(q_0 \) to a unit vector in the direction of \(a_0 \text{.}\)

    Notice that \(a_0 = q_0 \rho_{0,0} \)

  • Compute vector \(q_1 \) of unit length so that \(\Span( \{ a_0, a_1 \} ) = \Span( \{ q_0, q_1 \} ) \text{:}\)

    • \(\rho_{0,1} = q_0^H a_1 \)

      Computes \(\rho_{0,1} \) so that \(\rho_{0,1} q_0 = q_0^H a_1 q_0 \) equals the component of \(a_1 \) in the direction of \(q_0 \text{.}\)

    • \(a_1^\perp = a_1 - \rho_{0,1} q_0 \)

      Computes the component of \(a_1 \) that is orthogonal to \(q_0 \text{.}\)

    • \(\rho_{1,1} = \| a_1^\perp \|_2 \)

      Computes the length of vector \(a_1^\perp \text{.}\)

    • \(q_1 = a_1^\perp / \rho_{1,1} \)

      Sets \(q_1 \) to a unit vector in the direction of \(a_1^\perp \text{.}\)

    Notice that

    \begin{equation*} \left( \begin{array}{c | c} a_0 \amp a_1 \end{array} \right) = \left( \begin{array}{c | c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c | c} \rho_{0,0} \amp \rho_{0,1} \\ \hline 0 \amp \rho_{1,1} \end{array} \right) . \end{equation*}
  • Compute vector \(q_2 \) of unit length so that \(\Span( \{ a_0, a_1, a_2 \} ) = \Span( \{ q_0, q_1, q_2 \} ) \text{:}\)

    • \(\begin{array}{l} \rho_{0,2} = q_0^H a_2 \\ \rho_{1,2} = q_1^H a_2 \end{array} \mbox{ or, equivalently, } \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) = \left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right)^H a_2\)

      Computes \(\rho_{0,2} \) so that \(\rho_{0,2} q_0 = q_0^H a_2 q_0 \) and \(\rho_{1,2} q_1 = q_1^H a_2 q_1 \) equal the components of \(a_2 \) in the directions of \(q_0 \) and \(q_1 \text{.}\)

      Or, equivalently, \(\left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) \) is the component in \(\Span( \{ q_0, q_1 \} ) \text{.}\)

    • \(a_2^\perp = a_2 - \rho_{0,2} q_0 - \rho_{1,2} q_1 = a_2 - \left( \begin{array}{c c} q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c} \rho_{0,2} \\ \rho_{1,2} \end{array} \right) \)

      Computes the component of \(a_2 \) that is orthogonal to \(q_0 \) and \(q_1 \text{.}\)

    • \(\rho_{2,2} = \| a_2^\perp \|_2 \)

      Computes the length of vector \(a_2^\perp \text{.}\)

    • \(q_2 = a_2^\perp / \rho_{2,2} \)

      Sets \(q_2 \) to a unit vector in the direction of \(a_2^\perp \text{.}\)

    Notice that

    \begin{equation*} \left( \begin{array}{c c | c} a_0 \amp a_1 \amp a_2 \end{array} \right) = \left( \begin{array}{c c | c} q_0 \amp q_1 \amp q_2 \end{array} \right) \left( \begin{array}{c c | c} \rho_{0,0} \amp \rho_{0,1} \amp \rho_{0,2} \\ 0 \amp \rho_{1,1} \amp \rho_{1,2} \\ \hline 0 \amp 0 \amp \rho_{2,2} \end{array} \right) . \end{equation*}
  • And so forth.

Projection a vector \(y \) onto the orthonormal columns of \(Q \in \Cmxn \text{:}\)

\begin{equation*} \begin{array}{|l | l|} \hline [ y^\perp, r ] = {\rm Proj\!\!\perp\!\! Q}_{\rm CGS}( Q, y ) ~~~~~~~~~~~~ \amp [ y^\perp, r ] = {\rm Proj\!\!\perp\!\! Q}_{\rm MGS}( Q, y ) ~~~~~~~~~~~~ \\ {\rm (used~by~CGS)} \amp {\rm (used~by~MGS)} \\ \hline y^\perp = y \amp y^\perp = y \\ {\bf for~} i=0, \ldots, k-1 \amp {\bf for~} i=0, \ldots, k-1 \\ ~~~\rho_i := q_i^H y \amp ~~~ \rho_i := q_i^H y^\perp \\ ~~~ y^\perp := y^\perp - \rho_i q_i \amp ~~~ y^\perp := y^\perp - \rho_i q_i \\ {\bf endfor} \amp {\bf endfor} \\ \hline \end{array} \end{equation*}

Gram-Schmidt orthogonalization algorithms:

\begin{equation*} \newcommand{\operation}{ \left[Q, R \right] := \mbox{Gram-Schmidt}( A ) } \newcommand{\routinename}{ \left[ A, R \right] := \mbox{GS}( A ) \quad \mbox{ (overwrites $ A $ with $ Q $)}} % Step 3: Loop-guard \newcommand{\guard}{ n( A_L ) \lt n( A ) } % Step 4: Initialize \newcommand{\partitionings}{ A \rightarrow \FlaOneByTwo{A_L}{A_R} , R \rightarrow \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} } \newcommand{\partitionsizes}{ A_L {\rm ~has~} 0 {\rm ~columns~and~} R_{TL} {\rm ~is~} 0 \times 0 } % Step 5a: Repartition the operands \newcommand{\repartitionings}{ \FlaOneByTwo{A_L}{A_R} \rightarrow \FlaOneByThreeR{A_0}{a_1}{A_2} , \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} \rightarrow \FlaThreeByThreeBR{R_{00}}{r_{01}}{R_{02}} {0}{\rho_{11}}{r_{12}^T} {0}{0}{R_{22}} } % \newcommand{\repartitionsizes}{ % a_1 {\rm ~and~} q_1 {\rm ~are~columns,~} % \rho_{11} {\rm is~a~scalar} % } % Step 5b: Move the double lines \newcommand{\moveboundaries}{ \FlaOneByTwo{A_L}{A_R} \leftarrow \FlaOneByThreeL{A_0}{a_1}{A_2} , \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} \leftarrow \FlaThreeByThreeTL{R_{00}}{r_{01}}{R_{02}} {0}{\rho_{11}}{r_{12}^T} {0}{0}{R_{22}} } % Step 8: Insert the updates required to change the % state from that given in Step 6 to that given in Step 7 % Note: The below needs editing!!! \newcommand{\update}{ \begin{array}{l | l | l} \hline {\rm CGS} \amp {\rm MGS} \amp {\rm MGS~(alternative)} \\ r_{01} := A_0^H a_1 \amp \amp \\ a_1 := a_1 - A_0 r_{01} \amp [ a_1, r_{01} ] = {\rm Proj\!\!\perp\!\! toQ}_{\rm MGS}( A_0, a_1 ) \amp \\ \rho_{11} := \| a_1 \|_2 \amp \rho_{11} := \| a_1 \|_2 \amp \rho_{11} := \| a_1 \|_2 \\ a_1 := a_1 / \rho_{11} \amp q_1 := a_1 / \rho_{11} \amp a_1 := a_1 / \rho_{11} \\ \amp \amp r_{12}^T := a_1^H A_2 \\ \amp \amp A_2 := A_2 - a_1 r_{12}^T \\ \hline \end{array} } \FlaAlgorithm % this command generates the worksheet using the % commands that were defined between the \resetsteps % and the worksheet command \end{equation*}

Classic example that shows that the columns of \(Q \text{,}\) computed by MGS, are "more orthogonal" than those computed by CGS:

\begin{equation*} A = \left( \begin{array}{c | c | c } 1 \amp 1 \amp 1 \\ \epsilon \amp 0 \amp 0 \\ 0 \amp \epsilon \amp 0 \\ 0 \amp 0 \amp \epsilon \end{array} \right) = \left( \begin{array}{c | c | c } a_0 \amp a_1 \amp a_2 \end{array} \right). \end{equation*}

Cost of Gram-Schmidt algorithms: approximately \(2 m n^2\) flops.

Definition 3.5.2.2.

Let \(u \in \Cn \) be a vector of unit length (\(\| u \|_2 = 1 \)). Then \(H = I - 2 u u^H \) is said to be a Householder transformation or (Householder) reflector.

If \(H \) is a Householder transformation (reflector), then

  • \(H H = I \text{.}\)

  • \(H = H^H \text{.}\)

  • \(H^H H = H H^ = I \text{.}\)

  • \(H^{-1} = H^H = H \text{.}\)

Computing a Householder transformation \(I - 2 u u^H \text{:}\)

  • Real case:

    • \(v = x \mp \| x \|_2 e_0 \text{.}\)

      \(v = x + \sign( \chi_1 ) \| x \|_2 e_0 \) avoids catastrophic cancellation.

    • \(u = v / \| v \|_2 \)

  • Complex case:

    • \(v = x \mp \complexone \| x \|_2 e_0 \text{.}\)

      (Picking \(\complexone \) carefully avoids catastrophic cancellation.)

    • \(u = v / \| v \|_2 \)

Practical computation of \(u \) and \(\tau \) so that \(I - u u^H / \tau \) is a Householder transformation (reflector):

\begin{equation*} \begin{array}{|l|} \hline {\rm Algorithm:~} \left[ \FlaTwoByOneSingleLine {\rho} {u_2} , \tau \right] = {\rm Housev} \left( \FlaTwoByOneSingleLine {\chi_1} {x_2} \right) \\ \hline \begin{array}{l | l} \amp \chi_2 := \| x_2 \|_2 \\ \amp \alpha := \left\| \left( \begin{array}{c} \chi_1 \\ \chi_2 \end{array} \right) \right\|_2 (= \| x \|_2) \\ \rho = - \sign( \chi_1 ) \| x \|_2 \amp \rho := - \sign( \chi_1 ) \alpha \\ \nu_1 = \chi_1 + \sign( \chi_1 ) \| x \|_2 \amp \nu_1 := \chi_1 - \rho \\ u_2 = x_2 / \nu_1 \amp u_2 := x_2 / \nu_1 \\ \amp \chi_2 = \chi_2 / \vert \nu_1 \vert ( = \| u_2 \|_2 ) \\ \tau = ( 1 + u_2^H u_2 ) / 2 \amp \tau = ( 1 + \chi_2^2 ) / 2 \end{array} \\ \hline \end{array} \end{equation*}

Householder QR factorization algorithm:

\begin{equation*} \newcommand{\routinename}{[ A, t ] = {\rm HQR\_unb\_var1}( A )} \newcommand{\guard}{n( A_{BR} ) > 0} \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } {\rm ~and~} t \rightarrow \FlaTwoByOne{ t_{T} } { t_{B} } } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} 0 \times 0 {\rm ~and~} t_T {\rm ~has~} 0 {\rm ~elements} } \newcommand{\repartitionings}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \rightarrow \FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } {\rm ~and~} \FlaTwoByOne{ t_{T} } { t_{B} } \rightarrow \FlaThreeByOneB{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\moveboundaries}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \leftarrow \FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } {\rm ~and~} \FlaTwoByOne{ t_{T} } { t_{B} } \leftarrow \FlaThreeByOneT{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\update}{ \begin{array}{l} \left[ \FlaTwoByOneSingleLine { \alpha_{11} } { a_{21}}, \tau_1 \right] := \left[ \FlaTwoByOneSingleLine { \rho_{11} } { u_{21}}, \tau_1 \right] = {\rm Housev} \left( \begin{array}{c} \alpha_{11} \\ a_{21} \end{array} \right) \\ {\rm Update~} \left( \begin{array}{c} a_{12}^T \\ A_{22} \end{array} \right) := \left( I - \frac{1}{\tau_1} \left( \begin{array}{c} 1 \\ u_{21} \end{array} \right) \left( \begin{array}{c c } 1 \amp u_{21}^H \end{array} \right) \right) \left( \begin{array}{c} a_{12}^T \\ A_{22} \end{array} \right) \\ ~~~~~{\rm ~via~the~steps~} \\ ~~~~~~~~~~~~~~~ w_{12}^T := ( a_{12}^T + a_{21}^H A_{22} )/\tau_1 \\ ~~~~~~~~~~~~~~~\FlaTwoByOneSingleLine { a_{12}^T } { A_{22} } := \FlaTwoByOneSingleLine { a_{12}^T - { w_{12}^T } } { A_{22} - { a_{21} } { w_{12}^T } } \end{array} } \FlaAlgorithm \end{equation*}

Cost: approximately \(2 m n^2 - \frac{2}{3} n^3 \) flops.

Algorithm for forming \(Q \) given output of Householder QR factorization algorithm:

\begin{equation*} \newcommand{\routinename}{[ A ] = {\rm FormQ}( A, t )} \newcommand{\guard}{n( A_{TL} ) > 0} \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{ A_{TL} } { A_{TR}} { A_{BL} } { A_{BR} }, t \rightarrow \FlaTwoByOne{ t_{T} } { t_{B} } } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} n(A) \times n(A) {\rm ~and~} t_T {\rm ~has~} n(A) {\rm ~elements} } \newcommand{\repartitionings}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \rightarrow \FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} }, \FlaTwoByOne{ t_{T} } { t_{B} } \rightarrow \FlaThreeByOneT{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\moveboundaries}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \leftarrow \FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} }, \FlaTwoByOne{ t_{T} } { t_{B} } \leftarrow \FlaThreeByOneB{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\update}{ \begin{array}{l} {\rm Update~} \left( \begin{array}{c | c} \alpha_{11} \amp a_{12}^T \\ \hline a_{21} \amp A_{22} \end{array} \right) := \\ ~~~~~~~~~~~~~~~ \left( I - \frac{1}{\tau_1} \left( \begin{array}{c} 1 \\ \hline u_{21} \end{array} \right) \left( \begin{array}{c | c } 1 \amp u_{21}^H \end{array} \right) \right) \left( \begin{array}{c | c} 1 \amp 0 \\ \hline 0 \amp A_{22} \end{array} \right) \\ ~~~~{\rm via~the~steps} \\ ~~~~~~~~~~{ \alpha_{11} } := { 1 - 1/\tau_1 } \\ ~~~~~~~~~~a_{12}^T := - ( a_{21}^H A_{22} )/\tau_1 \\ ~~~~~~~~~~{ A_{22} } := { A_{22} + { a_{21} } { a_{12}^T } }\\ ~~~~~~~~~~{ a_{21} } := { - a_{21}/\tau_1 } \end{array} } \FlaAlgorithm \end{equation*}

Cost: approximately \(2 m n^2 - \frac{2}{3} n^3 \) flops.

Algorithm for applying \(Q^H \) given output of Householder QR factorization algorithm:

\begin{equation*} \newcommand{\routinename}{[ y ] = {\rm Apply\_QH}( A, t, y )} \newcommand{\guard}{n( A_{BR} ) \lt 0} \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } , t \rightarrow \FlaTwoByOne{ t_{T} } { t_{B} } , y \rightarrow \FlaTwoByOne{ y_{T} } { y_{B} } } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} 0 \times 0 {\rm ~and~} t_T, y_T {\rm ~have~} 0 {\rm ~elements}} \newcommand{\repartitionings}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \rightarrow \FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } ,\\ \FlaTwoByOne{ t_{T} } { t_{B} } \rightarrow \FlaThreeByOneB{ t_{0} } { \tau_{1} } { t_{2} } , \FlaTwoByOne{ y_{T} } { y_{B} } \rightarrow \FlaThreeByOneB{ y_{0} } { \psi_{1} } { y_{2} } } \newcommand{\moveboundaries}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \leftarrow \FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } , \\ \FlaTwoByOne{ t_{T} } { t_{B} } \leftarrow \FlaThreeByOneT{ t_{0} } { \tau_{1} } { t_{2} } , \FlaTwoByOne{ y_{T} } { y_{B} } \leftarrow \FlaThreeByOneT{ y_{0} } { \psi_{1} } { y_{2} } } \newcommand{\update}{ \begin{array}{l} {\rm Update~} \left( \begin{array}{c} \psi_1 \\ y_{2} \end{array} \right) := \left( I - \frac{1}{\tau_1} \left( \begin{array}{c} 1 \\ u_{21} \end{array} \right) \left( \begin{array}{c c } 1 \amp u_{21}^H \end{array} \right) \right) \left( \begin{array}{c} \psi_1 \\ y_{2} \end{array} \right) \\ ~~~~~~~~~~{\rm via~the~steps} \\ ~~~~~~~~~~~~~~~ \omega_{1} := ( \psi_{1} + a_{21}^H y_{2} )/\tau_1 \\ ~~~~~~~~~~~~~ \FlaTwoByOneSingleLine { \psi_{1} } { y_{2} } := \FlaTwoByOneSingleLine { \psi_{1} - { \omega_{1} } } { y_{2} - \omega_1 { u_{2} } } \end{array} } \FlaAlgorithm \end{equation*}

Cost: approximately \(4 m n - n^2 \) flops.