Subsection 4.4.3 Computing the matrix-matrix product

Homework 4.4.3.1.

Compute

\begin{equation*} Q = P \times P = \begin{MatrixR} 0.4 \amp 0.3 \amp 0.1 \\ 0.4 \amp 0.3 \amp 0.6 \\ 0.2 \amp 0.4 \amp 0.3 \end{MatrixR} \begin{MatrixR} 0.4 \amp 0.3 \amp 0.1 \\ 0.4 \amp 0.3 \amp 0.6 \\ 0.2 \amp 0.4 \amp 0.3 \end{MatrixR} = \end{equation*}
Solution

Compute

\begin{equation*} Q = P \times P = \begin{MatrixR} 0.4 \amp 0.3 \amp 0.1 \\ 0.4 \amp 0.3 \amp 0.6 \\ 0.2 \amp 0.4 \amp 0.3 \end{MatrixR} \begin{MatrixR} 0.4 \amp 0.3 \amp 0.1 \\ 0.4 \amp 0.3 \amp 0.6 \\ 0.2 \amp 0.4 \amp 0.3 \end{MatrixR} = \begin{MatrixR} 0.30 \amp 0.25 \amp 0.25 \\ 0.40 \amp 0.45 \amp 0.40 \\ 0.30 \amp 0.30 \amp 0.35 \end{MatrixR}. \end{equation*}
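The arithmetic above can be double-checked with a short plain-Python sketch; the `matmul` helper below is a hypothetical name written just for this check, not part of the text.

```python
# Minimal matrix-matrix product over lists of rows, used to check the
# homework answer; matmul is a hypothetical helper, not from the text.
def matmul(A, B):
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

P = [[0.4, 0.3, 0.1],
     [0.4, 0.3, 0.6],
     [0.2, 0.4, 0.3]]

# Round to two decimals so floating-point noise does not obscure the answer.
Q = [[round(q, 2) for q in row] for row in matmul(P, P)]
print(Q)  # [[0.3, 0.25, 0.25], [0.4, 0.45, 0.4], [0.3, 0.3, 0.35]]
```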
Homework 4.4.3.2.

Let \(A = \left( \begin{array}{r r r} 2 \amp 0 \amp 1 \\ -1 \amp 1 \amp 0 \\ 1 \amp 3 \amp 1 \\ -1 \amp 1 \amp 1 \end{array} \right)\) and \(B = \left( \begin{array}{r r r r} 2 \amp 1 \amp 2 \amp 1 \\ 0 \amp 1 \amp 0 \amp 1 \\ 1 \amp 0 \amp 1 \amp 0 \end{array} \right) \text{.}\) Compute

  • \(A B = \)

  • \(B A = \)

Solution
  • \(A B = \left( \begin{array}{r r r r} 5 \amp 2 \amp 5 \amp 2\\ -2 \amp 0 \amp -2 \amp 0 \\ 3 \amp 4 \amp 3 \amp 4 \\ -1 \amp 0 \amp -1 \amp 0 \\ \end{array} \right).\)

  • \(B A = \left( \begin{array}{r r r} 4 \amp 8 \amp 5 \\ -2 \amp 2 \amp 1 \\ 3 \amp 3 \amp 2 \end{array} \right).\)
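The same kind of sketch (again with a hypothetical `matmul` helper) confirms both products and makes the size difference visible: \(A B \) is \(4 \times 4 \) while \(B A \) is \(3 \times 3 \text{.}\)

```python
# Check both products from the homework; matmul is a hypothetical helper.
def matmul(A, B):
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

A = [[ 2, 0, 1],
     [-1, 1, 0],
     [ 1, 3, 1],
     [-1, 1, 1]]
B = [[2, 1, 2, 1],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]

AB = matmul(A, B)  # a 4 x 4 matrix
BA = matmul(B, A)  # a 3 x 3 matrix
print(AB)  # [[5, 2, 5, 2], [-2, 0, -2, 0], [3, 4, 3, 4], [-1, 0, -1, 0]]
print(BA)  # [[4, 8, 5], [-2, 2, 1], [3, 3, 2]]
```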

Homework 4.4.3.3.

Let \(A \in \mathbb{R}^{m \times k} \) and \(B \in \mathbb{R}^{k \times n} \) and \(A B = B A \text{.}\)

ALWAYS/SOMETIMES/NEVER: \(A \) and \(B \) are square matrices.

Answer

ALWAYS

Why?

Solution

The result of \(A B \) is an \(m \times n \) matrix. For \(B A \) to even be defined, the column size of \(B \) must match the row size of \(A \text{,}\) so \(n = m \text{;}\) the result is then a \(k \times k \) matrix. For \(A B = B A \) the sizes must match, so \(m = k \) and \(n = k \text{.}\) In other words, \(m = n = k \text{,}\) and both matrices are square.

Homework 4.4.3.4.

Let \(A \in \mathbb{R}^{m \times k} \) and \(B \in \mathbb{R}^{k \times n} \text{.}\)

ALWAYS/SOMETIMES/NEVER:

\begin{equation*} A B = B A . \end{equation*}
Answer

SOMETIMES

Why?

Solution

If \(m \neq n \) then \(B A \) is not even defined because the sizes of the matrices don't match up. But if \(A \) is square and \(A = B \text{,}\) then clearly \(A B = A A = B A \text{.}\)


So, there are examples where the statement is true and examples where the statement is false.

Homework 4.4.3.5.

Let \(A, B \in \mathbb{R}^{n \times n} \text{.}\)

ALWAYS/SOMETIMES/NEVER: \(A B = B A \text{.}\)

Answer

SOMETIMES

Why?

Solution

Almost any random matrices \(A \) and \(B \) will have the property that \(A B \neq B A \text{.}\) But if you pick, for example, \(n = 1 \) or \(A = I \) or \(A = 0 \) or \(A = B \text{,}\) then \(A B = B A \text{.}\) There are many other examples.

The bottom line: Matrix multiplication, unlike scalar multiplication, does not necessarily commute.
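A small sketch makes the SOMETIMES concrete: a generic pair of \(2 \times 2 \) matrices does not commute, while every matrix commutes with the identity. The `matmul` helper is a hypothetical name, not from the text.

```python
# Two concrete 2 x 2 examples; matmul is a hypothetical helper.
def matmul(A, B):
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]   # swaps columns when applied on the right, rows on the left
AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA)  # [[2, 1], [4, 3]] [[3, 4], [1, 2]] -- not equal

I = [[1, 0], [0, 1]]
print(matmul(A, I) == matmul(I, A) == A)  # True: A commutes with I
```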

Homework 4.4.3.6.

\(A^2 \) is defined as \(A A \text{.}\) Similarly, \(A^k = \underbrace{A A \cdots A}_{k \text{ occurrences of } A} \text{.}\) Consistent with this, \(A^0 = I \) so that \(A^k = A^{k-1} A \) for \(k \gt 0 \text{.}\)

TRUE/FALSE: \(A^k \) is well-defined only if \(A \) is a square matrix.

Answer

TRUE

Why?

Solution

Just check the sizes of the matrices: for \(A A \) to be defined, the number of columns of \(A \) must equal the number of rows of \(A \text{.}\) Hence \(A \) must be square.
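The recursive definition \(A^0 = I \text{,}\) \(A^k = A^{k-1} A \) translates directly into a loop. This sketch uses `matpow` and `matmul`, hypothetical helper names written for illustration.

```python
# Hypothetical helpers illustrating matrix powers for square A.
def matmul(A, B):
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def matpow(A, k):
    """A^k for square A, via A^0 = I and A^k = A^(k-1) A."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # I
    for _ in range(k):
        result = matmul(result, A)
    return result

A = [[1, 1], [0, 1]]
print(matpow(A, 0))  # [[1, 0], [0, 1]], i.e., I
print(matpow(A, 3))  # [[1, 3], [0, 1]]
```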

Homework 4.4.3.7.

Let \(A, B, C \) be matrices "of appropriate size" so that \(( A B ) C\) is well defined.

ALWAYS/SOMETIMES/NEVER: \(A ( B C ) \) is well defined.

Answer

ALWAYS

Why?

Solution

For \(( A B) C \) to be well defined, \(A \in \mathbb{R}^{m_A \times n_A} \text{,}\) \(B \in \mathbb{R}^{m_B \times n_B} \text{,}\) and \(C \in \mathbb{R}^{m_C \times n_C} \text{,}\) where \(n_A = m_B \) and \(n_B = m_C \text{.}\) But then \(B C \) is well defined because \(n_B = m_C \text{,}\) and it results in an \(m_B \times n_C \) matrix. Finally, \(A ( B C ) \) is well defined because \(n_A = m_B \text{.}\)
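Associativity can be spot-checked on conformal but non-square matrices, with shapes chosen as in the dimension argument above (the `matmul` helper is hypothetical).

```python
# Spot-check that (A B) C == A (B C) on non-square, conformal matrices.
def matmul(A, B):
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

A = [[1, 2, 0], [0, 1, 1]]        # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]      # 3 x 2
C = [[1, 1, 0, 0], [0, 1, 1, 0]]  # 2 x 4

left = matmul(matmul(A, B), C)   # (A B) C, a 2 x 4 matrix
right = matmul(A, matmul(B, C))  # A (B C), also 2 x 4
print(left == right)  # True
```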

The question now becomes how to compute \(C \) given matrices \(A \) and \(B\text{.}\) For this, we are going to use and abuse the standard basis vectors \(e_j \text{.}\)

Consider the following. Let

  • \(C \in \mathbb{R}^{m \times n} \text{,}\) \(A \in \mathbb{R}^{m \times k} \text{,}\) and \(B \in \mathbb{R}^{k \times n} \text{;}\) and

  • \(C = A B \text{;}\) and

  • \(L_{C}: \mathbb{R}^n \rightarrow \mathbb{R}^m \) equal the linear transformation such that \(L_C( x ) = C x \text{;}\) and

  • \(L_{A}: \mathbb{R}^k \rightarrow \mathbb{R}^m \) equal the linear transformation such that \(L_A( x ) = A x \text{;}\) and

  • \(L_{B}: \mathbb{R}^n \rightarrow \mathbb{R}^k \) equal the linear transformation such that \(L_B( x ) = B x \text{;}\) and

  • \(e_j \) denote the \(j \)th standard basis vector; and

  • \(c_j \) denote the \(j \)th column of \(C \text{;}\) and

  • \(b_j \) denote the \(j \)th column of \(B \text{.}\)

Then

\begin{equation*} c_j = C e_j = L_C( e_j ) = L_A( L_B( e_j ) ) = L_A( B e_j ) = L_A( b_j ) = A b_j . \end{equation*}

From this we learn that

Remark 4.4.3.1.

If \(C = A B \) then the \(j \)th column of \(C \text{,}\) \(c_j \text{,}\) equals \(A b_j \text{,}\) where \(b_j \) is the \(j \)th column of \(B \text{.}\)

Since by now you should be very comfortable with partitioning matrices by columns, we can summarize this as
Remark 4.4.3.2.
\begin{equation*} \left( \begin{array}{c | c | c | c } c_0 \amp c_1 \amp \cdots \amp c_{n-1} \end{array} \right) = C = A B = A \left( \begin{array}{c | c | c | c } b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) = \left( \begin{array}{c | c | c | c } A b_0 \amp A b_1 \amp \cdots \amp A b_{n-1} \end{array} \right). \end{equation*}
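This column-at-a-time view of \(C = A B \) computes \(c_j = A b_j \) for each \(j \text{.}\) A plain-Python sketch of it follows; the helper names `matvec` and `matmul_by_columns` are hypothetical, introduced only for illustration.

```python
# Form C = A B one column at a time, as in the remark: c_j = A b_j.
def matvec(A, x):
    """A x: entry i is the dot product of row i of A with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul_by_columns(A, B):
    """C = A B, built column by column."""
    m, n = len(A), len(B[0])
    cols = [matvec(A, [row[j] for row in B]) for j in range(n)]  # cols[j] = A b_j
    return [[cols[j][i] for j in range(n)] for i in range(m)]    # back to rows

A = [[2, 0], [1, 1]]
B = [[1, 2], [3, 4]]
print(matmul_by_columns(A, B))  # [[2, 4], [4, 6]]
```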

Now, let's expose the elements of \(C \text{,}\) \(A \text{,}\) and \(B\text{.}\)

\begin{equation*} \begin{array}{c c} C = \left( \begin{array}{c c c c } \gamma_{0,0} \amp \gamma_{0,1} \amp \cdots \amp \gamma_{0,n-1} \\ \gamma_{1,0} \amp \gamma_{1,1} \amp \cdots \amp \gamma_{1,n-1} \\ \vdots \amp \vdots \amp \vdots \amp \vdots \\ \gamma_{m-1,0} \amp \gamma_{m-1,1} \amp \cdots \amp \gamma_{m-1,n-1} \\ \end{array} \right) , \quad A = \left( \begin{array}{c c c c } \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,k-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,k-1} \\ \vdots \amp \vdots \amp \vdots \amp \vdots \\ \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,k-1} \\ \end{array} \right), \\ \mbox{and} \quad B = \left( \begin{array}{c c c c } \beta_{0,0} \amp \beta_{0,1} \amp \cdots \amp \beta_{0,n-1} \\ \beta_{1,0} \amp \beta_{1,1} \amp \cdots \amp \beta_{1,n-1} \\ \vdots \amp \vdots \amp \vdots \amp \vdots \\ \beta_{k-1,0} \amp \beta_{k-1,1} \amp \cdots \amp \beta_{k-1,n-1} \\ \end{array} \right). \end{array} \end{equation*}
Remark 4.4.3.3.

We are going to show that

\begin{equation*} \gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j} , \end{equation*}

which you may have learned in a high school algebra course.

We reasoned that \(c_j = A b_j \text{:}\)

\begin{equation*} \left( \begin{array}{c c c c } \gamma_{0,j} \\ \gamma_{1,j} \\ \vdots \\ \hline \multicolumn{1}{|c|} {\gamma_{i,j}} \\ \hline \vdots \\ \gamma_{m-1,j} \\ \end{array} \right) = \left( \begin{array}{c c c c } \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,k-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,k-1} \\ \vdots \amp \vdots \amp \vdots \amp \vdots \\ \hline \multicolumn{1}{|c}{\alpha_{i,0}} \amp \alpha_{i,1} \amp \cdots \amp \multicolumn{1}{c|}{\alpha_{i,k-1}} \\ \hline \vdots \amp \vdots \amp \vdots \amp \vdots \\ \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,k-1} \\ \end{array} \right) \left( \begin{array}{|c|} \hline \beta_{0,j} \\ \beta_{1,j} \\ \vdots \\ \beta_{k-1,j} \\ \hline \end{array} \right). \end{equation*}

Here we highlight the \(i\)th element of \(c_j \text{,}\) \(\gamma_{i,j} \text{,}\) and the \(i \)th row of \(A \text{.}\) We recall that the \(i \)th element of \(A x \) equals the dot product of the \(i \)th row of \(A \) with the vector \(x \text{.}\) Thus, \(\gamma_{i,j} \) equals the dot product of the \(i \)th row of \(A \) with the vector \(b_j \text{:}\)

\begin{equation*} \gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j} . \end{equation*}
Remark 4.4.3.4.

Let \(A \in \mathbb{R}^{m \times k} \text{,}\) \(B \in \mathbb{R}^{k \times n} \text{,}\) and \(C \in \mathbb{R}^{m \times n} \text{.}\) Then the matrix-matrix multiplication (product) \(C = A B \) is computed by

\begin{equation*} \gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j} = \alpha_{i,0} \beta_{0,j} + \alpha_{i,1} \beta_{1,j} + \cdots + \alpha_{i,k-1} \beta_{k-1,j}. \end{equation*}

As a result of this definition, \(C x = A ( B x ) = ( A B ) x \text{,}\) and we can drop the parentheses unless they are useful for clarity: \(C x = A B x \) and \(C = A B \text{.}\)
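The formula \(\gamma_{i,j} = \sum_{p=0}^{k-1} \alpha_{i,p} \beta_{p,j} \) is exactly a triple loop over \(i \text{,}\) \(j \text{,}\) and \(p \text{.}\) A plain-Python sketch (the function name is hypothetical):

```python
# Elementwise matrix-matrix product: gamma_{i,j} = sum_p alpha_{i,p} beta_{p,j}.
def matmul_triple_loop(A, B):
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_triple_loop(A, B))  # [[19, 22], [43, 50]]
```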

Remark 4.4.3.5.

We emphasize that for matrix-matrix multiplication to be a legal operation, the row and column dimensions of the matrices must obey certain constraints. Whenever we talk about dimensions being conformal, we mean that the dimensions are such that the encountered matrix multiplications are valid operations.