ALAFF \(A \) has linearly dependent columns

Subsection 4.4.4 \(A \) has linearly dependent columns

Let us now consider the case where \(A \in \Cmxn \) has rank \(r \leq n \text{.}\) In other words, it has \(r \) linearly independent columns. Let \(p \in \Rn \) be a permutation vector, by which we mean a permutation of the vector

\begin{equation*} \left( \begin{array}{c} 0 \\ 1 \\ \vdots \\ n-1 \end{array}\right) \end{equation*}

And \(P( p ) \) be the matrix that, when applied to a vector \(x \in \Cn \) permutes the entries of \(x \) according to the vector \(p \text{:}\)

\begin{equation*} P( p ) x = \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c} e_{\pi_0}^T \\ e_{\pi_1}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right) } \\ P( p ) \end{array} x = \left( \begin{array}{c} e_{\pi_0}^T x\\ e_{\pi_1}^T x\\ \vdots \\ e_{\pi_{n-1}}^T x \end{array} \right) = \left( \begin{array}{c} \chi_{\pi_0} \\ \chi_{\pi_1} \\ \vdots \\ \chi_{\pi_{n-1}} \end{array} \right) . \end{equation*}

where \(e_j \) equals the columns of \(I \in \Rnxn \) indexed with \(j \) (and hence the standard basis vector indexed with \(j \)).

If we apply \(P( p )^T \) to \(A \in \Cmxn \) from the right, we get

\begin{equation*} \begin{array}{l} A P( p )^T \\ ~~~= ~~~~ \lt \mbox{ definition of } P( p ) \gt \\ A \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right)^T \\ ~~~= ~~~~ \lt \mbox{ transpose } \gt \\ A \left( \begin{array}{c | c | c } e_{\pi_0} \amp \cdots \amp e_{\pi_{n-1}} \end{array} \right) \\ ~~~= ~~~~ \lt \mbox{ matrix multiplication by columns } \gt \\ \left( \begin{array}{c | c | c } A e_{\pi_0} \amp \cdots \amp A e_{\pi_{n-1}} \end{array} \right) \\ ~~~= ~~~~ \lt B e_j = b_j \gt \\ \left( \begin{array}{c | c | c } a_{\pi_0} \amp \cdots \amp a_{\pi_{n-1}} \end{array} \right) . \end{array} \end{equation*}

In other words, applying the transpose of the permutation matrix to \(A \) from the right permutes its columns as indicated by the permutation vector \(p \text{.}\)

The discussion about permutation matrices gives us the ability to rearrange the columns of \(A \) so that the first \(r = \rank( A ) \) columns are linearly independent.

Theorem 4.4.4.1.

Assume \(A \in \Cmxn \) and that \(r = \rank( A ) \text{.}\) Then there exists a permutation vector \(p \in \Rn \text{,}\) orthonormal matrix \(Q_L \in \C^{m \times r} \text{,}\) upper triangular matrix \(R_{TL} \in \C^{r \times r} \text{,}\) and \(R_{TR} \in \C^{r \times (n-r)} \) such that

\begin{equation*} A P( p )^T = Q_L \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right). \end{equation*}

Proof.

Let \(p \) be the permutation vector such that the first \(r \) columns of \(A^P = A P( p )^T \) are linearly independent. Partition

\begin{equation*} A^P = A P( p )^T = \left( \begin{array}{c | c } A_L^P \amp A_R^P \end{array} \right) \end{equation*}

where \(A_L^P \in \C^{m \times r} \text{.}\) Since \(A_L^P \) has linearly independent columns, its QR factorization, \(A^P = Q_L R_{TL} \text{,}\) exists. Since all the linearly independent columns of matrix \(A\) were permuted to the left, the remaining columns, now part of \(A_R^P \text{,}\) are in the column space of \(A_L^P \) and hence in the column space of \(Q_L \text{.}\) Hence \(A_R^P = Q_L R_{TR} \) for some matrix \(R_{TR} \text{,}\) which then must satisfy \(Q_L^H A_R^P = R_{TR} \) giving us a means by which to compute it. We conclude that

\begin{equation*} A^P = A P( p )^T = \left( \begin{array}{c | c } A_L^P \amp A_R^P \end{array} \right) = Q_L \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right). \end{equation*}

Let us examine how this last theorem can help us solve the LLS

\begin{equation*} \mbox{Find } \hat x \in \Cn \mbox{ such that } \| b - A \hat x \|_2 = \min_{x \in \Cn} \| b - A x \|_2 \end{equation*}

when \(\rank( A ) \leq n \text{:}\)

\begin{equation*} \begin{array}{l} \min_{x \in \Cn} \| b - A x \|_2 \\ ~~~=~~~~ \lt P(p)^T P( p ) = I \gt \\ \min_{x \in \Cn} \| b - A P( p )^T P( p ) x \|_2 \\ ~~~=~~~~ \lt A P( p )^T = Q_L \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right) \gt \\ \min_{x \in \Cn} \| b - Q_L \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right) P( p ) x } \\ w \end{array} \|_2 \\ ~~~=~~~~ \lt \mbox{ substitute } w = \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right) P( p ) x \gt \\ \min_{w \in \C^r} \| b - Q_L w \|_2 \\ \end{array} \end{equation*}

which is minimized when \(w = Q_L^H b \text{.}\) Thus, we are looking for vector \(\hat x \) such that

\begin{equation*} \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right) P( p ) \hat x = Q_L^H b . \end{equation*}

Substituting

\begin{equation*} z = \left( \begin{array}{c} z_T \\ \hline z_B \end{array} \right) \end{equation*}

for \(P(p) \hat x \) we find that

\begin{equation*} \left( \begin{array}{c | c } R_{TL} \amp R_{TR} \end{array} \right) \left( \begin{array}{c} z_T \\ \hline z_B \end{array} \right) = Q_L^H b . \end{equation*}

Now, we can pick \(z_B \in \C^{n-r} \) to be an arbitrary vector, and determine a corresponding \(z_T \) by solving

\begin{equation*} R_{TL} z_T = Q_L^H b - R_{TR} z_B. \end{equation*}

A convenient choice is \(z_B = 0 \) so that \(z_T \) solves

\begin{equation*} R_{TL} z_T = Q_L^H b. \end{equation*}

Regardless of choice of \(z_B \text{,}\) the solution \(\hat x \) is given by

\begin{equation*} \hat x = P( p )^T \left( \begin{array}{c} R_{TL}^{-1} ( Q_L^H b - R_{TR} z_B )\\ \hline z_B \end{array} \right). \end{equation*}

(a permutation of vector \(z \text{.}\)) This defines an infinite number of solutions if \(\rank( A ) \lt n \text{.}\)

The problem is that we don't know which columns are linearly independent in advance. In enrichments in Subsection 4.5.1 and Subsection 4.5.2, rank-revealing QR factorization algorithms are discussed that overcome this problem.