Subsection 2.4 Question 4
Question 4.
Compute \(\left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} 1 \\ 0 \\ -2 \end{array} \right) = \)
Compute \(\left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} -1 \\ 1 \\ 0 \end{array} \right) = \)
Compute \(\left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r r} 1 \amp -1\\ 0 \amp 1 \\ -2 \amp 0 \end{array} \right) = \)
-
Why is matrix-matrix multiplication defined the way it is? In other words, if
\begin{equation*} A = \left( \begin{array}{c c c c} \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,k-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,k-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,k-1} \end{array} \right), B = \left( \begin{array}{c c c c} \beta_{0,0} \amp \beta_{0,1} \amp \cdots \amp \beta_{0,n-1} \\ \beta_{1,0} \amp \beta_{1,1} \amp \cdots \amp \beta_{1,n-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ \beta_{k-1,0} \amp \beta_{k-1,1} \amp \cdots \amp \beta_{k-1,n-1} \end{array} \right), \end{equation*}and
\begin{equation*} C = \left( \begin{array}{c c c c} \gamma_{0,0} \amp \gamma_{0,1} \amp \cdots \amp \gamma_{0,n-1} \\ \gamma_{1,0} \amp \gamma_{1,1} \amp \cdots \amp \gamma_{1,n-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ \gamma_{m-1,0} \amp \gamma_{m-1,1} \amp \cdots \amp \gamma_{m-1,n-1} \end{array} \right) \end{equation*}and \(C = A B \) then
\begin{equation*} \gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j}. \end{equation*}
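This definition translates directly into a triple loop. A minimal sketch (matrices stored as lists of rows; the function name `matmat` is our own choice for illustration):

```python
# Compute C = A B element by element, straight from the definition
# gamma_{i,j} = sum_p alpha_{i,p} beta_{p,j}.

def matmat(A, B):
    """Multiply an m x k matrix A by a k x n matrix B (lists of rows)."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

A = [[1, -2, 2], [-2, 1, -1], [2, -1, 1]]
B = [[1, -1], [0, 1], [-2, 0]]
print(matmat(A, B))  # → [[-3, -3], [0, 3], [0, -3]]
```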
Answer
\(\displaystyle \left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} 1 \\ 0 \\ -2 \end{array} \right) = \left( \begin{array}{r} -3 \\ 0 \\ 0 \end{array} \right) \)
\(\displaystyle \left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} -1 \\ 1 \\ 0 \end{array} \right) = \left( \begin{array}{r} -3 \\ 3 \\ -3 \end{array} \right) \)
\(\displaystyle \left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r r} 1 \amp -1\\ 0 \amp 1 \\ -2 \amp 0 \end{array} \right) = \left( \begin{array}{r r} -3 \amp -3\\ 0 \amp 3 \\ 0 \amp -3 \end{array} \right) \)
-
Why is matrix-matrix multiplication defined the way it is?
\(C \) is the matrix that represents the linear transformation that equals the composition of the linear transformations represented by \(A \) and \(B \text{.}\) That is, if \(L_A( y ) = A y \) and \(L_B ( x ) = B x \text{,}\) then \(L_C( x ) = L_A ( L_B ( x ) ) = C x \text{.}\)
Solution
\(\displaystyle \left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} 1 \\ 0 \\ -2 \end{array} \right) = \left( \begin{array}{r} -3 \\ 0 \\ 0 \end{array} \right) \)
\(\displaystyle \left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r} -1 \\ 1 \\ 0 \end{array} \right) = \left( \begin{array}{r} -3 \\ 3 \\ -3 \end{array} \right) \)
-
\(\left( \begin{array}{r r r} 1 \amp -2 \amp 2 \\ -2 \amp 1 \amp -1\\ 2 \amp -1 \amp 1 \end{array} \right) \left( \begin{array}{r | r} 1 \amp -1\\ 0 \amp 1 \\ -2 \amp 0 \end{array} \right) = \left( \begin{array}{r | r} -3 \amp -3\\ 0 \amp 3 \\ 0 \amp -3 \end{array} \right) .\)
Notice that, given the computations in the first two parts, you should not have to perform any computations for this part. The reason is that if
\begin{equation*} C = A B, \end{equation*}and we partition \(C \) and \(B \) by columns,
\begin{equation*} C = \left( \begin{array}{c | c | c | c} c_0 \amp c_1 \amp \cdots \amp c_{n-1} \end{array} \right) \mbox{ and } B = \left( \begin{array}{c | c | c | c} b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) , \end{equation*}then
\begin{equation*} \begin{array}{rcl} \left( \begin{array}{c | c | c | c} c_0 \amp c_1 \amp \cdots \amp c_{n-1} \end{array} \right) \amp = \amp A \left( \begin{array}{c | c | c | c} b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) \\ \amp = \amp \left( \begin{array}{c | c | c | c} A b_0 \amp A b_1 \amp \cdots \amp A b_{n-1} \end{array} \right) . \end{array} \end{equation*}Thus, one can simply copy over the results from the first two parts into the columns of the result for this part.
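This column-at-a-time view can be sketched in code as well (a minimal illustration; `matvec` is an assumed helper computing a matrix-vector product):

```python
# Form C = A B one column at a time: c_j = A b_j.

def matvec(A, x):
    # dot each row of A with the vector x
    return [sum(a_p * x_p for a_p, x_p in zip(row, x)) for row in A]

def matmat_by_columns(A, B):
    n = len(B[0])
    # column j of C is A times column j of B
    columns = [matvec(A, [B[p][j] for p in range(len(B))]) for j in range(n)]
    # reassemble the computed columns into a list of rows
    return [list(row) for row in zip(*columns)]

A = [[1, -2, 2], [-2, 1, -1], [2, -1, 1]]
B = [[1, -1], [0, 1], [-2, 0]]
print(matmat_by_columns(A, B))  # → [[-3, -3], [0, 3], [0, -3]]
```

Notice the two columns of the result are exactly the two matrix-vector products computed in the first two parts.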
-
Why is matrix-matrix multiplication defined the way it is?
An acceptable short answer would be to say that \(C \) represents the composition of the linear transformations represented by matrices \(A \) and \(B \text{,}\) which explains why \(C \) is computed the way it is.
-
The long answer:
Let \(L_A( x ) = A x \) and \(L_B( x ) = B x \) be the linear transformations that are represented by matrices \(A \) and \(B \text{,}\) respectively. It can be shown that the composition of \(L_A\) and \(L_B \text{,}\) defined by
\begin{equation*} L_C( x ) = L_A( L_B ( x ) ), \end{equation*}is a linear transformation.
Let \(C \) equal the matrix that represents \(L_C \text{,}\) \(e_j \) the \(j\)th standard basis vector (the column of the identity indexed with \(j \)), and partition
\begin{equation*} C = \left( \begin{array}{c | c | c | c} c_0 \amp c_1 \amp \cdots \amp c_{n-1} \end{array} \right) \mbox{ and } B = \left( \begin{array}{c | c | c | c} b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) . \end{equation*}Then we know that
\begin{equation*} c_j = L_A( L_B( e_j ) ) = A( B e_j ) = A b_j. \end{equation*}Remembering how matrix-vector multiplication is computed, this means that
\begin{equation*} \gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j} = \widetilde a_i^T b_j. \end{equation*}This notation captures that element \(i,j \) of \(C \) equals the dot product of \(\widetilde a_i^T \text{,}\) the row of \(A \) indexed with \(i \text{,}\) and \(b_j \text{,}\) the column of \(B \) indexed with \(j \text{.}\)
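The dot-product view of a single element can be sketched as follows (an illustration only; the helper names `dot` and `gamma` are our own):

```python
# Element (i, j) of C = A B as the dot product of row i of A
# (a~_i^T in the text) with column j of B (b_j in the text).

def dot(x, y):
    return sum(x_p * y_p for x_p, y_p in zip(x, y))

def gamma(A, B, i, j):
    a_tilde_i = A[i]                        # row i of A
    b_j = [B[p][j] for p in range(len(B))]  # column j of B
    return dot(a_tilde_i, b_j)

A = [[1, -2, 2], [-2, 1, -1], [2, -1, 1]]
B = [[1, -1], [0, 1], [-2, 0]]
print(gamma(A, B, 1, 1))  # → 3
```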
Relevance
Almost every algorithm in linear algebra can be described as a sequence of matrix-matrix multiplications. Hence, understanding how to look at matrix-matrix multiplication in different ways is of utmost importance.
One way is the way you were likely first introduced to matrix-matrix multiplication, where each element of \(C \) is computed via a dot product of the appropriate row of \(A \) with the appropriate column of \(B \text{.}\)
Another is to compute columns of \(C \) as the product of \(A \) times the corresponding column of \(B \text{.}\) This is more directly related to why matrix-matrix multiplication is defined the way it is.
We will see more ways of looking at matrix-matrix multiplication in subsequent questions.
More generally, understanding how to multiply partitioned matrices is of central importance to the elegant, practical derivation and presentation of algorithms.
Resources
A learner who deduces from this exercise that they need a refresher on matrix-matrix multiplication may wish to spend some time with Week 4 and Week 5 of LAFF.