ALAFF Permutation matrices

Subsection 5.3.2 Permutation matrices

Recall that we already discussed permutation in Subsection 4.4.4 in the setting of column pivoting when computing the QR factorization.

Definition 5.3.2.1.

Given

\begin{equation*} p = \left( \begin{array}{c} \pi_0 \\ \vdots \\ \pi_{n-1} \end{array} \right), \end{equation*}

where \(\{ \pi_0, \pi_1, \ldots , \pi_{n-1} \} \) is a permutation (rearrangement) of the integers \(\{ 0 , 1, \ldots , n-1 \} \text{,}\) we define the permutation matrix \(P( p )\) by

\begin{equation*} P( p ) = \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right). \end{equation*}

Homework 5.3.2.1.

Let

\begin{equation*} p = \left( \begin{array}{c} \pi_0 \\ \vdots \\ \pi_{n-1} \end{array} \right) \mbox{ and } x = \left( \begin{array}{c} \chi_0 \\ \vdots \\ \chi_{n-1} \end{array} \right). \end{equation*}

Evaluate \(P( p ) x \text{.}\)

Solution

\begin{equation*} \begin{array}{l} P( p ) x \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right) x \\ ~~~=~~~~ \lt \mbox{ matrix-vector multiplication by rows } \gt \\ \left( \begin{array}{c} e_{\pi_0}^Tx \\ \vdots \\ e_{\pi_{n-1}}^Tx \end{array} \right) \\ ~~~=~~~~ \lt e_j^T x = x_j \gt \\ \left( \begin{array}{c} \chi_{\pi_0} \\ \vdots \\ \chi_{\pi_{n-1}} \end{array} \right) \end{array} \end{equation*}

The last homework shows that applying \(P( p ) \) to a vector \(x \) rearranges the elements of that vector according to the permutation indicated by the vector \(p \text{.}\)

Homework 5.3.2.2.

Let

\begin{equation*} p = \left( \begin{array}{c} \pi_0 \\ \vdots \\ \pi_{n-1} \end{array} \right) \mbox{ and } A = \left( \begin{array}{c} \widetilde a_0^T \\ \vdots \\ \widetilde a_{n-1}^T \end{array} \right). \end{equation*}

Evaluate \(P( p ) A \text{.}\)

Solution

\begin{equation*} \begin{array}{l} P( p ) A \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right) A \\ ~~~=~~~~ \lt \mbox{ matrix-matrix multiplication by rows } \gt \\ \left( \begin{array}{c} e_{\pi_0}^TA \\ \vdots \\ e_{\pi_{n-1}}^TA \end{array} \right) \\ ~~~=~~~~ \lt e_j^T A = \widetilde a_j^T \gt \\ \left( \begin{array}{c} \widetilde a_{\pi_0}^T \\ \vdots \\ \widetilde a_{\pi_{n-1}}^T \end{array} \right) \end{array} \end{equation*}

The last homework shows that applying \(P( p ) \) to a matrix \(A \) rearranges the rows of that matrix according to the permutation indicated by the vector \(p \text{.}\)

Homework 5.3.2.3.

Let

\begin{equation*} p = \left( \begin{array}{c} \pi_0 \\ \vdots \\ \pi_{n-1} \end{array} \right) \mbox{ and } A = \left( \begin{array}{c c c} a_0 \amp \cdots \amp a_{n-1} \end{array} \right). \end{equation*}

Evaluate \(A P(p)^T \text{.}\)

Solution

\begin{equation*} \begin{array}{l} A P( p )^T \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ A \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right)^T \\ ~~~=~~~~ \lt \mbox{ transpose } P( p ) \gt \\ A \left( \begin{array}{c c c} e_{\pi_0} \amp \cdots \amp e_{\pi_{n-1}} \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ matrix-matrix multiplication by columns } \gt \\ \left( \begin{array}{c c c} A e_{\pi_0} \amp \cdots \amp A e_{\pi_{n-1}} \end{array} \right) \\ ~~~=~~~~ \lt A e_j = a_j \gt \\ \left( \begin{array}{c c c} a_{\pi_0} \amp \cdots \amp a_{\pi_{n-1}} \end{array} \right) \end{array} \end{equation*}

The last homework shows that applying \(P( p )^T \) from the right to a matrix \(A \) rearranges the columns of that matrix according to the permutation indicated by the vector \(p \text{.}\)

Homework 5.3.2.4.

Evaluate \(P( p ) P( p )^T \text{.}\)

Answer

\(P( p ) P( p )^T = I \)

Solution

\begin{equation*} \begin{array}{l} P(p) P( p )^T \\ ~~~=~~~~ \lt \mbox{ definition } \gt \\ \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right) \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right)^T \\ ~~~=~~~~ \lt \mbox{ transpose } P( p ) \gt \\ \left( \begin{array}{c} e_{\pi_0}^T \\ \vdots \\ e_{\pi_{n-1}}^T \end{array} \right) \left( \begin{array}{c c c} e_{\pi_0} \amp \cdots \amp e_{\pi_{n-1}} \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ evaluate } \gt \\ \left( \begin{array}{c c c c} e_{\pi_0}^T e_{\pi_0} \amp e_{\pi_0}^T e_{\pi_1} \amp \cdots \amp e_{\pi_0}^T e_{\pi_{n-1}} \\ e_{\pi_1}^T e_{\pi_0} \amp e_{\pi_1}^T e_{\pi_1} \amp \cdots \amp e_{\pi_1}^T e_{\pi_{n-1}} \\ \vdots \amp \vdots \amp \amp \vdots \\ e_{\pi_{n-1}}^T e_{\pi_0} \amp e_{\pi_{n-1}}^T e_{\pi_1} \amp \cdots \amp e_{\pi_{n-1}}^T e_{\pi_{n-1}} \\ \end{array} \right) \\ ~~~=~~~~ \lt e_i^T e_j = \cdots \gt \\ \left( \begin{array}{c c c c} 1 \amp 0 \amp \cdots \amp 0 \\ 0 \amp 1 \amp \cdots \amp 0 \\ \vdots \amp \vdots \amp \amp \vdots \\ 0 \amp 0 \amp \cdots \amp 1 \end{array} \right) \end{array} \end{equation*}

We will see that when discussing the LU factorization with partial pivoting, a permutation matrix that swaps the first element of a vector with the \(\pi\)-th element of that vector is a fundamental tool.

Definition 5.3.2.2. Elementary pivot matrix.

Given \(\pi \in \{ 0, \ldots , n-1 \} \) define the elementary pivot matrix

\begin{equation*} \widetilde P( \pi ) = \left( \begin{array}{c} e_\pi^T \\ \hline e_1^T \\ \vdots \\ e_{\pi-1}^T \\ \hline e_0^T \\ \hline e_{\pi+1}^T \\ \vdots \\ e_{n-1}^T \end{array} \right) \end{equation*}

or, equivalently,

\begin{equation*} \widetilde P( \pi ) = \left\{ \begin{array}{c l} I_n \amp \mbox{if } \pi = 0 \\ \left( \begin{array}{c | c | c | c} 0 \amp 0 \amp 1 \amp0 \\ \hline 0 \amp I_{\pi-1}\amp 0 \amp 0\\ \hline 1 \amp0 \amp0 \amp0 \\ \hline 0 \amp 0 \amp 0\amp I_{n-\pi-1} \end{array} \right) \amp \mbox{otherwise}, \end{array} \right. \end{equation*}

where \(n \) is the size of the permutation matrix.

When \(\tilde P( \pi ) \) is applied to a vector, it swaps the top element with the element indexed with \(\pi \text{.}\) When it is applied to a matrix, it swaps the top row with the row indexed with \(\pi \text{.}\) The size of matrix \(\tilde P( \pi ) \) is determined by the size of the vector or the row size of the matrix to which it is applied.

In discussing LU factorization with pivoting, we will use elementary pivot matrices in a very specific way, which necessitates the definition of how a sequence of such pivots are applied. Let \(p\) be a vector of integers satisfying the conditions

\begin{equation} p = \left( \begin{array}{c} \pi_0 \\ \vdots \\ \pi_{k-1} \end{array} \right), \quad \mbox{ where } \quad 1 \leq k \leq n \mbox{ and } 0 \leq \pi_i \lt n-i,\label{chapter05-pivot-eqn-cond}\tag{5.3.1} \end{equation}

then \(\widetilde P( p ) \) will denote the sequence of pivots

\begin{equation*} % \setlength{\arraycolsep}{2pt} \widetilde P( p ) = \left( \begin{array}{c c} I_{k-1} \amp 0 \\ 0 \amp \widetilde P( \pi_{k-1} ) \end{array} \right) \left( \begin{array}{c c} I_{k-2} \amp 0 \\ 0 \amp \widetilde P( \pi_{k-2} ) \end{array} \right) \cdots \left( \begin{array}{c c} 1 \amp 0 \\ 0 \amp \widetilde P( \pi_1 ) \end{array} \right) \widetilde P( \pi_0 ). \end{equation*}

(Here \(\widetilde P( \cdot ) \) is always an elementary pivot matrix "of appropriate size.") What this exactly does is best illustrated through an example:

Example 5.3.2.3.

Let

\begin{equation*} p = \left( \begin{array}{c} 2 \\ 1 \\ 1 \end{array} \right) \quad \mbox{and} \quad A = \left( \begin{array}{l l l l} 0.0 \amp 0.1 \amp 0.2 \\ 1.0 \amp 1.1 \amp 1.2 \\ 2.0 \amp 2.1 \amp 2.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right). \end{equation*}

Evaluate \(\widetilde P( p ) A \text{.}\)

Solution

\begin{equation*} \begin{array}{l} \widetilde P( p ) A \\ ~~~=~~~~ \lt \mbox{ instantiate } \gt \\ \widetilde P( \left( \begin{array}{c} 2 \\ 1 \\ 1 \end{array} \right) ) \left( \begin{array}{l l l l} 0.0 \amp 0.1 \amp 0.2 \\ 1.0 \amp 1.1 \amp 1.2 \\ 2.0 \amp 2.1 \amp 2.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ definition of } \widetilde P( \cdot ) \gt \\ \left( \begin{array}{c | c} 1 \amp 0 \\ \hline 0 \amp \widetilde P( \left( \begin{array}{c} 1 \\ 1 \end{array} \right) ) \end{array} \right) \widetilde P( 2 ) \left( \begin{array}{l l l l} 0.0 \amp 0.1 \amp 0.2 \\ 1.0 \amp 1.1 \amp 1.2 \\ 2.0 \amp 2.1 \amp 2.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ swap first row with row indexed with 2 } \gt \\ \left( \begin{array}{c | c} 1 \amp 0 \\ \hline 0 \amp \widetilde P( \left( \begin{array}{c} 1 \\ 1 \end{array} \right) ) \end{array} \right) \left( \begin{array}{l l l l} 2.0 \amp 2.1 \amp 2.2 \\ \hline 1.0 \amp 1.1 \amp 1.2 \\ 0.0 \amp 0.1 \amp 0.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ partitioned matrix-matrix multiplication } \gt \\ \left( \begin{array}{c} \left( \begin{array}{l l l } 2.0 \amp 2.1 \amp 2.2 \end{array} \right)\\ \hline \widetilde P( \left( \begin{array}{c} 1 \\ 1 \end{array} \right) ) \left( \begin{array}{l l l } 1.0 \amp 1.1 \amp 1.2 \\ 0.0 \amp 0.1 \amp 0.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right) \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ swap current first row with row indexed with 1 relative to that row } \gt \\ \left( \begin{array}{c} \left( \begin{array}{l l l } 2.0 \amp 2.1 \amp 2.2 \\ 0.0 \amp 0.1 \amp 0.2 \\ \end{array} \right)\\ \hline \widetilde P( \left( \begin{array}{c} 1 \end{array} \right) ) \left( \begin{array}{l l l } 1.0 \amp 1.1 \amp 1.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right) \end{array} \right) \\ ~~~=~~~~ \lt \mbox{ swap current first row with row indexed with 1 relative to that row } \gt \\ \left( \begin{array}{c} \left( \begin{array}{l l l } 2.0 \amp 2.1 \amp 2.2 \\ 0.0 \amp 0.1 \amp 0.2 \\ 3.0 \amp 3.1 \amp 3.2 \end{array} \right)\\ \hline \left( \begin{array}{l l l } 1.0 \amp 1.1 \amp 1.2 \\ \end{array} \right) \end{array} \right) \\ ~~~=~~~~ \\ \left( \begin{array}{l l l } 2.0 \amp 2.1 \amp 2.2 \\ 0.0 \amp 0.1 \amp 0.2 \\ 3.0 \amp 3.1 \amp 3.2 \\ 1.0 \amp 1.1 \amp 1.2 \\ \end{array} \right) \end{array} \\ \end{equation*}

The relation between \(\widetilde P ( \cdot ) \) and \(P( \cdot ) \) is tricky to specify:

\begin{equation*} \widetilde P( \left( \begin{array}{c} \pi_0 \\ \pi_1 \\ \vdots \\ \pi_{k-1} \end{array} \right) ) = P( \left( \begin{array}{c c} I_{k-1} \amp 0 \\ 0 \amp \widetilde P( \pi_{k-1} ) \end{array} \right) \cdots \left( \begin{array}{c c} 1 \amp 0 \\ 0 \amp \widetilde P( \pi_1 ) \end{array} \right) \widetilde P( \pi_0 ) \left( \begin{array}{c} 0\\ 1 \\ \vdots \\ k-1 \end{array} \right) ). \end{equation*}