Subsection 4.2.4 Conditioning of the linear least squares problem

Given $A \in \Cmxn$ with linearly independent columns and $b \in \Cm \text{,}$ consider the linear least squares (LLS) problem

$$\| b - A \widehat x \|_2 = \min_x \| b - A x \|_2 \label{eqn-LLS1}\tag{4.2.1}$$

and the perturbed problem

$$\| (b + \delta\!b ) - A (\widehat x+\delta\! \widehat x ) \|_2 = \min_{x} \| ( b + \delta\!b ) - A (x+\delta\!x ) \|_2 \label{eqn-LLS2}\tag{4.2.2}$$

The question we want to examine is by how much the relative error in $b$ is amplified into a relative error in $\widehat x \text{.}$ We will restrict our discussion to the case where $A$ has linearly independent columns.

Recall that $\widehat b \text{,}$ the projection of $b$ onto the column space of $A \text{,}$ satisfies

$$\widehat b = A \widehat x \label{eqn-LLS-cond1}\tag{4.2.3}$$

and the projection of $b + \delta\! b$ satisfies

$$\widehat b + \delta\! \widehat b = A ( \widehat x + \delta\! \widehat x ) \label{eqn-LLS-cond2}\tag{4.2.4}$$

where $\delta\! \widehat b$ equals the projection of $\delta\!b$ onto the column space of $A \text{.}$

Let $\theta$ equal the angle between the vector $b$ and its projection $\widehat b$ (which equals the angle between $b$ and the column space of $A$). Then

\begin{equation*} \cos(\theta) = \| \widehat b \|_2 / \| b \|_2 \end{equation*}

and hence

\begin{equation*} \cos(\theta) \| b \|_2 = \| \widehat b \|_2 = \| A \widehat x \|_2 \leq \| A \|_2 \| \widehat x \|_2 = \sigma_0 \| \widehat x \|_2 \end{equation*}

which (as long as $\widehat x \neq 0$) can be rewritten as

$$\frac{1}{\| \widehat x \|_2} \leq \frac{\sigma_0}{\cos( \theta )} \frac{1}{\| b \|_2}. \label{eqn-LLS-cond3}\tag{4.2.5}$$
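As a sanity check, the bound (4.2.5) can be verified numerically on a small example (a hypothetical illustration using NumPy; the matrix and right-hand side are made up):

```python
import numpy as np

# Hypothetical check of (4.2.5): 1/||xhat||_2 <= sigma_0 / (cos(theta) ||b||_2).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b = np.array([0.6, 0.8, 1.0])

xhat = np.linalg.lstsq(A, b, rcond=None)[0]   # least squares solution
bhat = A @ xhat                               # projection of b onto C(A)

cos_theta = np.linalg.norm(bhat) / np.linalg.norm(b)
sigma0 = np.linalg.svd(A, compute_uv=False)[0]   # largest singular value

lhs = 1 / np.linalg.norm(xhat)
rhs = sigma0 / (cos_theta * np.linalg.norm(b))
assert lhs <= rhs + 1e-12
```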

Subtracting (4.2.3) from (4.2.4) yields

\begin{equation*} \delta\! \widehat b = A \delta\! \widehat x \end{equation*}

or, equivalently, $A \delta\! \widehat x = \delta\! \widehat b \text{,}$ which is solved by

\begin{equation*} \begin{array}{rcl} \delta\! \widehat x & = & A^\dagger \delta\! \widehat b \\ & = & A^\dagger A ( A^H A )^{-1} A^H \delta\! b \\ & = & ( A^H A )^{-1} A^H A ( A^H A )^{-1} A^H \delta\! b \\ & = & A^\dagger \delta\! b , \end{array} \end{equation*}

where $A^\dagger = ( A^H A )^{-1} A^H$ is the pseudoinverse of $A$ and we recall that $\delta\! \widehat b = A ( A^H A )^{-1} A^H \delta\! b \text{.}$ Hence

$$\| \delta\! \widehat x \|_2 \leq \| A^\dagger \|_2 \| \delta\! b \|_2.\label{eqn-LLS-cond4}\tag{4.2.6}$$
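The bound (4.2.6) can likewise be observed numerically (a hypothetical example; the perturbation $\delta\!b$ is made up):

```python
import numpy as np

# Hypothetical check of (4.2.6): perturb b in a small LLS problem and
# verify || delta xhat ||_2 <= || A^+ ||_2 || delta b ||_2.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 0.5])
delta_b = 1e-3 * np.array([1.0, -1.0, 2.0])

xhat = np.linalg.lstsq(A, b, rcond=None)[0]
xhat_pert = np.linalg.lstsq(A, b + delta_b, rcond=None)[0]
delta_xhat = xhat_pert - xhat

# || A^+ ||_2 is the 2-norm of the pseudoinverse.
bound = np.linalg.norm(np.linalg.pinv(A), 2) * np.linalg.norm(delta_b)
assert np.linalg.norm(delta_xhat) <= bound + 1e-12
```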
Homework 4.2.4.1.

Let $A \in \Cmxn$ have linearly independent columns. Show that

\begin{equation*} \| ( A^H A )^{-1} A^H \|_2 = 1/\sigma_{n-1}, \end{equation*}

where $\sigma_{n-1}$ equals the smallest singular value of $A \text{.}$

Hint

Use the reduced SVD of $A \text{.}$

Solution

Let $A = U_L \Sigma_{TL} V^H$ be the reduced SVD of $A \text{,}$ where $V$ is square because $A$ has linearly independent columns. Then

\begin{equation*} \begin{array}{rcl} \| ( A^H A )^{-1} A^H \|_2 & = & \| ( ( U_L \Sigma_{TL} V^H )^H U_L \Sigma_{TL} V^H )^{-1} ( U_L \Sigma_{TL} V^H )^H \|_2 \\ & = & \| ( V \Sigma_{TL} U_L^H U_L \Sigma_{TL} V^H )^{-1} V \Sigma_{TL} U_L^H \|_2 \\ & = & \| ( V \Sigma_{TL}^{-1} \Sigma_{TL}^{-1} V^H ) V \Sigma_{TL} U_L^H \|_2 \\ & = & \| V \Sigma_{TL}^{-1} U_L^H \|_2 \\ & = & \| \Sigma_{TL}^{-1} U_L^H \|_2 \\ & = & 1/\sigma_{n-1}. \end{array} \end{equation*}

This last step needs some more explanation: Clearly $\| \Sigma_{TL}^{-1} U_L^H \|_2 \leq \| \Sigma_{TL}^{-1} \|_2 \| U_L^H \|_2 = \frac{1}{\sigma_{n-1}} \| U_L^H \|_2 \leq 1/\sigma_{n-1} \text{.}$ We need to show that there exists a vector $x$ with $\| x \|_2 = 1$ such that $\| \Sigma_{TL}^{-1} U_L^H x \|_2 = 1/\sigma_{n-1} \text{.}$ If we pick $x = u_{n-1}$ (the last column of $U_L$), then $\| \Sigma_{TL}^{-1} U_L^H u_{n-1} \|_2 = \| \Sigma_{TL}^{-1} e_{n-1} \|_2 = \| \frac{1}{\sigma_{n-1}} e_{n-1} \|_2 = 1/\sigma_{n-1} \text{.}$
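The homework result can be confirmed numerically with a small example whose singular values are known by construction (a hypothetical sketch; for a real matrix, $A^H = A^T$):

```python
import numpy as np

# Hypothetical check of the homework result: for A with linearly independent
# columns, || (A^H A)^{-1} A^H ||_2 equals 1 / (smallest singular value of A).
A = np.array([[3.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])   # singular values: 3 and 1

pinv = np.linalg.inv(A.T @ A) @ A.T          # (A^H A)^{-1} A^H (real case)
sigma = np.linalg.svd(A, compute_uv=False)   # returned in descending order
assert np.isclose(np.linalg.norm(pinv, 2), 1 / sigma[-1])
```

Here `pinv` works out to $\begin{pmatrix} 1/3 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \text{,}$ whose 2-norm is $1 = 1/\sigma_{n-1} \text{.}$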

Combining (4.2.5), (4.2.6), and the result in this last homework yields

$$\frac{\| \delta\! \widehat x\|_2}{\| \widehat x \|_2} \leq \frac{1}{\cos( \theta )} \frac{\sigma_0}{\sigma_{n-1}} \frac{\| \delta\! b \|_2}{\| b \|_2}. \label{eqn-LLS-cond5}\tag{4.2.7}$$

Notice the effect of the $\cos( \theta )$ term. If $b$ is almost perpendicular to $\Col( A ) \text{,}$ then its projection $\widehat b$ is small and $\cos( \theta )$ is small. Hence a small relative change in $b$ can be greatly amplified. This makes sense: if $b$ is almost perpendicular to $\Col( A ) \text{,}$ then $\widehat x \approx 0 \text{,}$ and any small $\delta\!b \in \Col(A)$ can yield a relatively large change $\delta\! \widehat x \text{.}$
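This amplification can be made concrete with a small, made-up example in which $\cos( \theta ) \approx 10^{-6} \text{:}$ a relative change of about $10^{-6}$ in $b$ doubles $\widehat x \text{.}$

```python
import numpy as np

# Hypothetical illustration: b nearly perpendicular to the column space of A.
A = np.array([[1.0],
              [0.0]])                  # C(A) = span{e_0}
b = np.array([1e-6, 1.0])              # cos(theta) ~ 1e-6
delta_b = np.array([1e-6, 0.0])        # ||delta_b||_2 / ||b||_2 ~ 1e-6

xhat = np.linalg.lstsq(A, b, rcond=None)[0]
xhat_pert = np.linalg.lstsq(A, b + delta_b, rcond=None)[0]

rel_change_b = np.linalg.norm(delta_b) / np.linalg.norm(b)
rel_change_x = np.linalg.norm(xhat_pert - xhat) / np.linalg.norm(xhat)
# rel_change_b is ~1e-6 while rel_change_x is ~1:
# amplification by roughly 1/cos(theta).
```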

Definition 4.2.4.1. Condition number of matrix with linearly independent columns.

Let $A \in \Cmxn$ have linearly independent columns (and hence $n \leq m$). Then its condition number (with respect to the 2-norm) is defined by

\begin{equation*} \kappa_2( A ) = \| A \|_2 \| A^\dagger \|_2 = \frac{\sigma_0}{\sigma_{n-1}}. \end{equation*}
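The two expressions in this definition can be checked against each other numerically (a hypothetical example whose singular values are known by construction):

```python
import numpy as np

# Hypothetical check: kappa_2(A) = ||A||_2 ||A^+||_2 matches sigma_0/sigma_{n-1}.
A = np.array([[2.0, 0.0],
              [0.0, 0.5],
              [0.0, 0.0]])   # singular values: 2 and 0.5, so kappa_2(A) = 4

sigma = np.linalg.svd(A, compute_uv=False)
kappa_from_svd = sigma[0] / sigma[-1]
kappa_from_norms = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.pinv(A), 2)
assert np.isclose(kappa_from_svd, 4.0)
assert np.isclose(kappa_from_norms, kappa_from_svd)
```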

It is informative to explicitly expose $\cos( \theta ) = \| \widehat b \|_2/ \| b \|_2$ in (4.2.7):

\begin{equation*} \frac{\| \delta\! \widehat x\|_2}{\| \widehat x \|_2} \leq \frac{\| b \|_2}{\| \widehat b \|_2} \frac{\sigma_0}{\sigma_{n-1}} \frac{\| \delta\! b \|_2}{\| b \|_2}. \end{equation*}

Notice that the ratio

\begin{equation*} \frac{\| \delta\! b \|_2}{\| b \|_2} \end{equation*}

can be made smaller by adding a component, $b_r \text{,}$ to $b$ that is orthogonal to $\Col( A )$ (and hence does not change the projection onto the column space, $\widehat b$):

\begin{equation*} \frac{\| \delta\! b \|_2}{\| b + b_r \|_2}. \end{equation*}

The factor $1/\cos( \theta )$ ensures that this does not magically reduce the relative error in $\widehat x \text{:}$

\begin{equation*} \frac{\| \delta\! \widehat x\|_2}{\| \widehat x \|_2} \leq \frac{\| b + b_r \|_2}{\| \widehat b \|_2} \frac{\sigma_0}{\sigma_{n-1}} \frac{\| \delta\! b \|_2}{\| b + b_r \|_2}. \end{equation*}
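The key fact underlying this observation, that a component orthogonal to $\Col( A )$ leaves $\widehat x$ unchanged, can be checked numerically (a made-up example; $b_r$ is chosen orthogonal to the column space):

```python
import numpy as np

# Hypothetical check: adding b_r orthogonal to C(A) leaves xhat unchanged.
A = np.array([[1.0],
              [0.0]])
b = np.array([1.0, 0.0])
b_r = np.array([0.0, 5.0])   # orthogonal to C(A) = span{e_0}

x1 = np.linalg.lstsq(A, b, rcond=None)[0]
x2 = np.linalg.lstsq(A, b + b_r, rcond=None)[0]
assert np.allclose(x1, x2)   # projection, hence xhat, is unchanged
```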