Skip to main content

Subsection 4.3.3 Case 2: General case

Now we show how to use the SVD to find \(\widehat x \) that satisfies

\begin{equation*} \| b - A \widehat x \|_2 = \min_x \| b - A x \|_2, \end{equation*}

where \(\rank(A) = r \text{,}\) with no assumptions about the relative size of \(m \) and \(n \text{.}\) In our discussion, we let \(A = U_L \Sigma_{TL} V_L^H \) equal its Reduced SVD and

\begin{equation*} A = \left( \begin{array}{c | c} U_L \amp U_R \end{array} \right) \left( \begin{array}{c | c} \Sigma_{TL} \amp 0 \\ \hline 0 \amp 0 \end{array} \right) \left( \begin{array}{c | c} V_L \amp V_R \end{array} \right)^H \end{equation*}

its SVD.

The first observation is, once more, that an \(\widehat x\) that minimizes \(\| b - A x \|_2 \) satisfies

\begin{equation*} A \widehat x = \widehat b, \end{equation*}

where \(\widehat b = U_L U_L^H b \text{,}\) the orthogonal projection of \(b\) onto the column space of \(A \text{.}\) Notice our use of "an \(\widehat x \)" since the solution won't be unique if \(r \lt m \) and hence the null space of \(A \) is not trivial. Substituting in the SVD this means that

\begin{equation*} \left( \begin{array}{c | c} U_L \amp U_R \end{array} \right) \left( \begin{array}{c | c} \Sigma_{TL} \amp 0 \\ \hline 0 \amp 0 \end{array} \right) \left( \begin{array}{c | c} V_L \amp V_R \end{array} \right)^H \widehat x = U_L U_L^H b. \end{equation*}

Multiplying both sides by \(U_L^H \) yields

\begin{equation*} \left( \begin{array}{c | c} I \amp 0 \end{array} \right) \left( \begin{array}{c | c} \Sigma_{TL} \amp 0 \\ \hline 0 \amp 0 \end{array} \right) \left( \begin{array}{c | c} V_L \amp V_R \end{array} \right)^H \widehat x = U_L^H b \end{equation*}

or, equivalently,

\begin{equation} \Sigma_{TL} V_L^H \widehat x = U_L^H b. \label{chapter04-LLS-general-1}\tag{4.3.1} \end{equation}

Any solution to this can be written as the sum of a vector in the row space of \(A \) with a vector in the null space of \(A \text{:}\)

\begin{equation*} \widehat x = V z = \left( \begin{array}{c | c } V_L \amp V_R \end{array} \right) \left( \begin{array}{c}{z_T}\\ \hline {z_B} \end{array}\right) = \begin{array}[t]{c} \underbrace{ V_L z_T} \\ x_r \end{array} + \begin{array}[t]{c} \underbrace{ V_R z_B} \\ x_n \end{array}. \end{equation*}

Substituting this into (4.3.1) we get

\begin{equation*} \Sigma_{TL} V_L^H ( V_L z_T + V_R z_B ) = U_L^H b , \end{equation*}

which leaves us with

\begin{equation*} \Sigma_{TL} z_T = U_L^H b . \end{equation*}

Thus, the solution in the row space is given by

\begin{equation*} x_r = V_L z_T = V_L \Sigma_{TL}^{-1} U_L^H b \end{equation*}

and the general solution is given by

\begin{equation*} \widehat x = V_L \Sigma_{TL}^{-1} U_L^H b + V_R z_B, \end{equation*}

where \(z_B \) is any vector in \(\C^{n-r} \text{.}\) This reasoning is captured in Figure 4.3.3.1.

images/Chapter04/FundamentalSpacesSVDLLS.pptx
Figure 4.3.3.1. Solving LLS via the SVD of \(A \text{.}\)
Homework 4.3.3.1.

Reason that

\begin{equation*} \widehat x = V_L \Sigma_{TL}^{-1} U_L^H b \end{equation*}

is the solution to the LLS problem with minimal length (2-norm). In other words, if \(x^\star \) satisfies

\begin{equation*} \| b - A x^\star \|_2 = \min_x \| b - A x \|_2 \end{equation*}

then \(\| \widehat x \|_2 \leq \| x^\star \|_2 \text{.}\)

Solution

The important insight is that

\begin{equation*} x^\star = \begin{array}[t]{c} \underbrace{V_L \Sigma_{TL}^{-1} U_L^H b}\\ \widehat x \end{array} + V_R z_B \end{equation*}

and that

\begin{equation*} V_L \Sigma_{TL}^{-1} U_L^H b \quad \mbox{and} \quad V_R z_B \end{equation*}

are orthogonal to each other (since \(V_L^H V_R = 0\)). If \(u^H v = 0 \) then \(\| u + v \|_2^2 = \| u \|_2^2 + \| v \|_2^2 \text{.}\) Hence

\begin{equation*} \| x^\star \|_2^2 = \| \widehat x + V_R z_B \|_2^2 = \| \widehat x \|_2^2 + \| V_R z_B \|_2^2 \geq \| \widehat x \|_2^2 \end{equation*}

and hence \(\| \widehat x \|_2 \leq \| x^\star \|_2 \text{.}\)