Skip to main content

Subsection 4.4.1 \(A \) has linearly independent columns

Since \(A = Q R \text{,}\) minimizing \(\| b - A x \|_2 \) means minimizing

\begin{equation*} \| b - Q \begin{array}[t]{c} \underbrace{R x}\\ z \end{array} \|_2. \end{equation*}

Since \(R \) is nonsingular, we can first find \(z \) that minimizes

\begin{equation*} \| b - Q z \|_2 \end{equation*}

after which we can solve \(R x = z \) for \(x \text{.}\) But from the Method of Normal Equations we know that the minimizing \(z\) solves

\begin{equation*} Q^H Q z = Q^H b. \end{equation*}

Since \(Q \) has orthonormal columns, we thus deduce that

\begin{equation*} z = Q^H b. \end{equation*}

Hence, the desired \(\hat x \) must satisfy

\begin{equation*} R \hat x = Q^H b \text{.} \end{equation*}

Let \(A = Q_L R_{TL} \) be the QR factorization of \(A \text{.}\) We know that then there exists a matrix \(Q_R \) such that \(Q = \FlaOneByTwo{ Q_L }{ Q_R } \) is unitary: \(Q_R \) is an orthonormal basis for the space orthogonal to the space spanned by \(Q_L \text{.}\) Now,

\begin{equation*} \begin{array}{l} \min_{x \in \Cn} \| b - A x \|_2^2 \\ ~~~=~~~~ \lt \mbox{ substitute } A = Q_L R_{TL} \gt\\ \min_{x \in \Cn} \| b - Q_L R_{TL} x \|_2^2 \\ ~~~=~~~~ \lt \mbox{ two norm is preserved since } Q^H \mbox{ is unitary } \gt \\ \min_{x \in \Cn} \| Q^H ( b - Q_L R_{TL} x ) \|_2^2 \\ ~~~=~~~~ \lt \mbox{ partitioning; distributing } \gt \\ \min_{x \in \Cn} \left\| \FlaTwoByOne{Q_L^H}{Q_R^H} b - \FlaTwoByOne{Q_L^H}{Q_R^H} Q_L R_{TL} x \right\|_2^2 \\ ~~~=~~~~ \lt \mbox{ partitioned matrix-matrix multiplication } \gt \\ \min_{x \in \Cn} \left\| \FlaTwoByOne{Q_L^Hb}{Q_R^Hb} - \FlaTwoByOne{R_{TL} x}{0 } \right\|_2^2 \\ ~~~=~~~~\lt \mbox{ partitioned matrix addition} \gt \\ \min_{x \in \Cn} \left\| \FlaTwoByOne{Q_L^Hb - R_{TL} x}{Q_R^Hb} \right\|_2^2 \\ ~~~=~~~~ \lt \mbox{ property of the 2-norm: } \left\| \left( \begin{array}{c} u \\ \hline v \end{array} \right) \right\|_2^2 = \| u \|_2^2 + \| v \|_2^2 ) \gt \\ \min_{x \in \Cn} \left( \left\| Q_L^H b - R_{TL} x \right\|_2^2 + \| Q_R^H b \|_2^2 \right) \\ ~~~=~~~~ \lt Q_R^H b \mbox{ is independent of } x \gt \\ \left( \min_{x \in \Cn} \left\| Q_L^H b - R_{TL} x \right\|_2^2 \right) + \| Q_R^H b \|_2^2 \\ ~~~=~~~~ \lt \mbox{ minimized by } \hat x \mbox{ that satisfies } R_{TL} \hat x = Q_L^Hb \gt \\ \| Q_R^H b \|_2^2. \end{array} \end{equation*}

Thus, the desired \(\hat x \) that minimizes the linear least squares problem solves \(R_{TL} \hat x = Q_L^H y \text{.}\) The solution is unique because \(R_{TL} \) is nonsingular (because \(A \) has linearly independent columns).

Homework 4.4.1.1.

Yet another alternative proof for Theorem 4.4.1.1 starts with the observation that the solution is given by \(\hat x = (A^H A)^{-1} A^H b \) and then substitutes in \(A = Q R \text{.}\) Give a proof that builds on this insight.

Solution

Recall that we saw in Subsection 4.2.2 that, if \(A \) has linearly independent columns, the LLS solution is given by \(\hat x = ( A^H A )^{-1} A^H b\) (the solution to the normal equations). Also, if \(A \) has linearly independent columns and \(A = Q R \) is its QR factorization, then the upper triangular matrix \(R \) is nonsingular (and hence has no zeroes on its diagonal).

Now,

\begin{equation*} \begin{array}{l} \hat x \\ ~~~=~~~~\lt {\rm Solution~to~the~Normal~Equations} \gt \\ ( A^H A )^{-1} A^H b \\ ~~~=~~~~ \lt A = Q R \gt \\ \left[ ( Q R)^H ( Q R ) \right]^{-1} ( Q R )^H b \\ ~~~ = ~~~~ \lt ( B C )^H = ( C^H B^H ) \gt \\ \left[ R^H Q^H Q R \right]^{-1} R^H Q^H b \\ ~~~ = ~~~~ \lt Q^H Q = I \gt \\ \left[ R^H R \right]^{-1} R^H Q^H b \\ ~~~~ = ~~~~ \lt ( B C )^{-1} = C^{-1} B^{-1} \gt \\ R^{-1} R^{-H} R^H Q^H b \\ ~~~ = ~~~~ \lt R^{-H} R^H = I \gt \\ R^{-1} Q^H b . \end{array} \end{equation*}

Thus, the \(\hat x \) that solves \(R \hat x = Q^H b \) solves the LLS problem.

Ponder This 4.4.1.2.

Create a picture similar to Figure 4.3.2.1 that uses the QR factorization rather than the SVD.