
Unit 2.3.2 Geometric interpretation

We will now illustrate what the SVD Theorem tells us about matrix-vector multiplication (linear transformations) by examining the case where \(A \in \R^{2 \times 2} \text{.}\) Let \(A = U \Sigma V^T \) be its SVD. (Notice that all matrices are now real valued, and hence \(V^H = V^T \text{.}\)) Partition

\begin{equation*} A = \left( \begin{array}{c | c} u_0 \amp u_1 \end{array} \right) \left( \begin{array}{c | c} \sigma_0 \amp 0 \\ \hline 0 \amp \sigma_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T. \end{equation*}

Since \(U \) and \(V \) are unitary matrices, \(\{ u_0, u_1 \} \) and \(\{ v_0, v_1 \} \) form orthonormal bases for the range and domain of \(A \text{,}\) respectively:

[Figure: two panels showing \(\R^{2} \text{,}\) the domain of \(A \text{,}\) and \(\R^{2} \text{,}\) the range (codomain) of \(A \text{.}\)]

Let us manipulate the decomposition a little:

\begin{equation*} \begin{array}{rcl} A \amp=\amp \left( \begin{array}{c | c} u_0 \amp u_1 \end{array} \right) \left( \begin{array}{c | c} \sigma_0 \amp 0 \\ \hline 0 \amp \sigma_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T \\ \amp = \amp \left[ \left( \begin{array}{c | c} u_0 \amp u_1 \end{array} \right) \left( \begin{array}{c | c} \sigma_0 \amp 0 \\ \hline 0 \amp \sigma_1 \end{array} \right) \right] \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T \\ \amp=\amp \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T. \end{array} \end{equation*}

Now let us look at how \(A \) transforms \(v_0 \) and \(v_1 \text{:}\)

\begin{equation*} A v_0 = \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T v_0 = \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c} 1 \\ \hline 0 \end{array} \right) = \sigma_0 u_0 \end{equation*}

and similarly \(A v_1 = \sigma_1 u_1 \text{.}\) This motivates the pictures in Figure 2.3.2.1.
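The identities \(A v_0 = \sigma_0 u_0 \) and \(A v_1 = \sigma_1 u_1 \) are easy to check numerically. The following sketch uses numpy with an arbitrary \(2 \times 2 \) matrix chosen only for illustration (it does not come from the text):

```python
import numpy as np

# An illustrative 2x2 matrix (an assumption, not from the text).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# numpy returns A = U @ np.diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A)
v0, v1 = Vt[0], Vt[1]        # rows of V^T are v_0^T and v_1^T
u0, u1 = U[:, 0], U[:, 1]    # left singular vectors

# A v_i equals sigma_i u_i (up to floating-point roundoff).
print(np.allclose(A @ v0, s[0] * u0))  # True
print(np.allclose(A @ v1, s[1] * u1))  # True
```

Note that numpy's `svd` returns \(V^T \) (here `Vt`) rather than \(V \text{,}\) so the right singular vectors are its rows.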

[Figure: two rows of panels, each showing \(\R^{2} \text{,}\) the domain of \(A \text{,}\) and \(\R^{2} \text{,}\) the range (codomain) of \(A \text{.}\)]

Figure 2.3.2.1. Illustration of how orthonormal vectors \(v_0 \) and \(v_1 \) are transformed by matrix \(A = U \Sigma V^T \text{.}\)

Next, let us look at how \(A \) transforms any vector with (Euclidean) unit length. Notice that \(x = \left( \begin{array}{c} \chi_0 \\ \chi_1 \end{array} \right)\) means that

\begin{equation*} x = \chi_0 e_0 + \chi_1 e_1 , \end{equation*}

where \(e_0 \) and \(e_1 \) are the unit basis vectors. Thus, \(\chi_0 \) and \(\chi_1 \) are the coefficients when \(x \) is expressed using \(e_0 \) and \(e_1 \) as the basis. However, we can also express \(x \) in the basis given by \(v_0 \) and \(v_1 \text{:}\)

\begin{equation*} \begin{array}{rcl} x \amp=\amp \begin{array}[t]{c} \underbrace{V V^T} \\ I \end{array} x = \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T x = \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) \left( \begin{array}{c} v_0^T x \\ \hline v_1^T x \end{array} \right)\\ \amp=\amp \begin{array}[t]{c} \underbrace{ v_0^T x } \\ \alpha_0 \end{array} v_0 + \begin{array}[t]{c} \underbrace{ v_1^T x } \\ \alpha_1 \end{array} v_1 = \alpha_0 v_0 + \alpha_1 v_1 = \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) \left( \begin{array}{c} \alpha_0 \\ \alpha_1 \end{array} \right) . \end{array} \end{equation*}

Thus, in the basis formed by \(v_0 \) and \(v_1 \text{,}\) the coefficients of \(x \) are \(\alpha_0 \) and \(\alpha_1 \text{.}\) Now,

\begin{equation*} \begin{array}{rcl} A x \amp=\amp \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T x \\ \amp = \amp \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right)^T \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) \left( \begin{array}{c} \alpha_0 \\ \alpha_1 \end{array} \right) \\ \amp=\amp \left( \begin{array}{c | c} \sigma_0 u_0 \amp \sigma_1 u_1 \end{array} \right) \left( \begin{array}{c} \alpha_0 \\ \alpha_1 \end{array} \right) = \alpha_0 \sigma_0 u_0 + \alpha_1 \sigma_1 u_1. \end{array} \end{equation*}
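The expansion \(A x = \alpha_0 \sigma_0 u_0 + \alpha_1 \sigma_1 u_1 \) with \(\alpha_i = v_i^T x \) can likewise be verified numerically. A minimal sketch, again with a made-up example matrix and unit vector:

```python
import numpy as np

# Illustrative matrix and unit-length vector (assumptions, not from the text).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
x = np.array([0.6, 0.8])            # Euclidean length 1

U, s, Vt = np.linalg.svd(A)
v0, v1 = Vt[0], Vt[1]
u0, u1 = U[:, 0], U[:, 1]

# Coefficients of x in the basis {v0, v1}: alpha_i = v_i^T x.
alpha0, alpha1 = v0 @ x, v1 @ x

# A x = alpha_0 sigma_0 u_0 + alpha_1 sigma_1 u_1.
print(np.allclose(A @ x, alpha0 * s[0] * u0 + alpha1 * s[1] * u1))  # True
```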

This is illustrated in Figure 2.3.2.1 (bottom), which also captures the fact that the unit ball is mapped to an ellipse whose major semi-axis has length \(\sigma_0 = \| A \|_2 \) and whose minor semi-axis has length \(\sigma_1 \text{.}\)
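One can check this claim about the image of the unit ball by sampling the unit circle and measuring how far \(A \) sends each point: the largest stretch should match \(\sigma_0 = \| A \|_2 \) and the smallest \(\sigma_1 \text{.}\) A sketch, with the same illustrative matrix as above:

```python
import numpy as np

# Illustrative matrix (an assumption, not from the text).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
s = np.linalg.svd(A, compute_uv=False)   # singular values only

# Sample the unit circle densely and map each point through A.
theta = np.linspace(0.0, 2.0 * np.pi, 10001)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # columns are unit vectors
lengths = np.linalg.norm(A @ circle, axis=0)

# Semi-axes of the image ellipse are sigma_0 and sigma_1.
print(np.isclose(lengths.max(), s[0], atol=1e-4))    # True: sigma_0 = ||A||_2
print(np.isclose(lengths.min(), s[1], atol=1e-4))    # True: sigma_1
```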

Finally, we show the same insights for a general vector \(x \) (not necessarily of unit length):

[Figure: \(\R^{2} \text{,}\) the domain of \(A \text{,}\) and \(\R^{2} \text{,}\) the range (codomain) of \(A \text{.}\)]

Another observation is that if one picks the right bases for the domain and codomain, then the computation \(A x \) simplifies to multiplication with a diagonal matrix. Let us again illustrate this for nonsingular \(A \in \R^{2 \times 2} \) with

\begin{equation*} A = \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c | c} u_0 \amp u_1 \end{array} \right) } \\ U \end{array} \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c | c} \sigma_0 \amp 0 \\ \hline 0 \amp \sigma_1 \end{array} \right) } \\ \Sigma \end{array} \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c | c} v_0 \amp v_1 \end{array} \right) } \\ V \end{array} ^T. \end{equation*}

Now, if we choose to express \(y \) using \(u_0 \) and \(u_1 \) as the basis and to express \(x \) using \(v_0 \) and \(v_1 \) as the basis, then

\begin{equation*} \begin{array}{rcl} \begin{array}[t]{c} \underbrace{U U^T} \\ I \end{array} y \amp=\amp U \begin{array}[t]{c} \underbrace{U^T y} \\ \widehat y \end{array} = ( u_0^T y ) u_0 + ( u_1^T y ) u_1 \\ \amp=\amp \left(\begin{array}{c | c} u_0 \amp u_1 \end{array} \right) \left( \begin{array}{c} u_0^T y \\ \hline u_1^T y \end{array}\right) = U \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c} \widehat \psi_0 \\ \hline \widehat \psi_1 \end{array} \right)}\\ \widehat y \end{array} \\ \begin{array}[t]{c} \underbrace{V V^T} \\ I \end{array} x \amp=\amp V \begin{array}[t]{c} \underbrace{V^T x} \\ \widehat x \end{array} = ( v_0^T x ) v_0 + ( v_1^T x ) v_1 \\ \amp=\amp \left(\begin{array}{c | c} v_0 \amp v_1 \end{array} \right) \left( \begin{array}{c} v_0^T x \\ \hline v_1^T x \end{array}\right) = V \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c} \widehat \chi_0 \\ \hline \widehat \chi_1 \end{array} \right)} \\ \widehat x \end{array}. \end{array} \end{equation*}

If \(y = A x \) then

\begin{equation*} U \begin{array}[t]{c} \underbrace{ U^T y } \\ \widehat y \end{array} = \begin{array}[t]{c} \underbrace{ U \Sigma V^T x } \\ A x \end{array} = U \Sigma \widehat x \end{equation*}

so that

\begin{equation*} \widehat y = \Sigma \widehat x \end{equation*}

and

\begin{equation*} \left( \begin{array}{c} \widehat \psi_0 \\ \hline \widehat \psi_1 \end{array} \right) = \left( \begin{array}{c} \sigma_0 \widehat \chi_0 \\ \hline \sigma_1 \widehat \chi_1 \end{array} \right). \end{equation*}
Remark 2.3.2.2.

The above discussion shows that if one transforms the input vector \(x \) and the output vector \(y \) into the right bases, then the computation \(y := A x \) can instead be carried out with a diagonal matrix: \(\widehat y := \Sigma \widehat x \text{.}\) Similarly, solving \(A x = y \) for \(x \) reduces to multiplying by the inverse of that diagonal matrix: \(\widehat x := \Sigma^{-1} \widehat y \text{.}\)
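The remark suggests a three-step recipe for solving \(A x = y \text{:}\) transform \(y \) into the basis \(\{ u_0, u_1 \} \text{,}\) divide componentwise by the singular values, and transform back via \(V \text{.}\) A minimal numerical sketch, assuming an arbitrary nonsingular example system:

```python
import numpy as np

# Hypothetical nonsingular 2x2 system (an illustration, not from the text).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, -1.0])

U, s, Vt = np.linalg.svd(A)

yhat = U.T @ y        # express y in the basis {u0, u1}: yhat = U^T y
xhat = yhat / s       # solve Sigma xhat = yhat: componentwise division
x = Vt.T @ xhat       # map coefficients back: x = V xhat

print(np.allclose(A @ x, y))  # True: x solves A x = y
```

Componentwise division by `s` is exactly the multiplication \(\widehat x := \Sigma^{-1} \widehat y \) from the remark, since \(\Sigma \) is diagonal.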

These observations generalize to \(A \in \C^{m \times n}\text{:}\) If

\begin{equation*} y = A x \end{equation*}

then

\begin{equation*} U^H y = U^H A \begin{array}[t]{c} \underbrace{V V^H }\\ I \end{array} x \end{equation*}

so that

\begin{equation*} \begin{array}[t]{c} \underbrace{U^H y}\\ \widehat y \end{array} = \Sigma \begin{array}[t]{c} \underbrace{V^H x}\\ \widehat x \end{array} \end{equation*}

(\(\Sigma \) is a rectangular "diagonal" matrix.)
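The rectangular, complex case \(\widehat y = \Sigma \widehat x \) can also be verified numerically. A sketch with a randomly generated \(3 \times 2 \) complex matrix (an assumption for illustration only), where `Vh` is numpy's name for \(V^H \text{:}\)

```python
import numpy as np

# A hypothetical complex 3x2 matrix and input vector, for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
y = A @ x

U, s, Vh = np.linalg.svd(A)      # full SVD: U is 3x3, Vh = V^H is 2x2
Sigma = np.zeros((3, 2))         # rectangular "diagonal" Sigma
Sigma[:2, :2] = np.diag(s)

yhat = U.conj().T @ y            # U^H y
xhat = Vh @ x                    # V^H x

print(np.allclose(yhat, Sigma @ xhat))  # True
```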