## Unit 1.3.6 Matrix-matrix multiplication via matrix-vector multiplications

Now that we are getting comfortable with partitioning matrices and vectors, we can view the six algorithms for $C := A B + C$ in a more layered fashion. If we partition $C$ and $B$ by columns, we find that

\begin{equation*} \begin{array}{rcl} \left( \begin{array}{c | c | c | c } c_0 & c_1 & \cdots & c_{n-1} \end{array} \right) &:=& A \left( \begin{array}{c | c | c | c } b_0 & b_1 & \cdots & b_{n-1} \end{array} \right) + \left( \begin{array}{c | c | c | c } c_0 & c_1 & \cdots & c_{n-1} \end{array} \right) \\ & = & \left( \begin{array}{c | c | c | c } A b_0 + c_0 & A b_1 + c_1 & \cdots & A b_{n-1} + c_{n-1} \end{array} \right) \end{array} \end{equation*}

In other words, each column $c_j$ of $C$ is computed by the matrix-vector multiplication $c_j := A b_j + c_j$.

This illustrates how the JPI and JIP algorithms can be layered as a loop around matrix-vector multiplications, each of which can in turn be implemented as a loop around axpy operations or dot products, respectively:

\begin{equation*} \begin{array}{l} {\bf for~} j := 0, \ldots , n-1 \\[0.15in] ~~~ \left. \begin{array}{l} {\bf for~} p := 0, \ldots , k-1 \\[0.15in] ~~~ ~~~ \left. \begin{array}{l} {\bf for~} i := 0, \ldots , m-1 \\ ~~~ ~~~ \gamma_{i,j} := \alpha_{i,p} \beta_{p,j} + \gamma_{i,j} \\ {\bf end} \end{array} \right\} ~~~ \begin{array}[t]{c} \underbrace{c_j := \beta_{p,j} a_p + c_j}\\ \mbox{axpy} \end{array} \\[0.2in] ~~~ {\bf end} \end{array} \right\} ~~~ \begin{array}[t]{c} \underbrace{c_j := A b_j + c_j}\\ \mbox{mv mult} \end{array} \\[0.2in] {\bf end} \end{array} \end{equation*}

and

\begin{equation*} \begin{array}{l} {\bf for~} j := 0, \ldots , n-1 \\[0.15in] ~~~ \left. \begin{array}{l} {\bf for~} i := 0, \ldots , m-1 \\[0.15in] ~~~ ~~~ \left. \begin{array}{l} {\bf for~} p := 0, \ldots , k-1 \\ ~~~ ~~~ \gamma_{i,j} := \alpha_{i,p} \beta_{p,j} + \gamma_{i,j} \\ {\bf end} \end{array} \right\} ~~~ \begin{array}[t]{c} \underbrace{\gamma_{i,j} := \widetilde a_i^T b_j + \gamma_{i,j}}\\ \mbox{dots} \end{array} \\[0.2in] ~~~ {\bf end} \end{array} \right\} ~~~ \begin{array}[t]{c} \underbrace{c_j := A b_j + c_j}\\ \mbox{mv mult} \end{array} \\[0.2in] {\bf end} \end{array} \end{equation*}
###### Remark 1.3.6.

The file name Gemm_J_Gemv.c can be decoded as follows:

• The Gemm stands for "GEneral Matrix-Matrix multiplication."

• The J indicates a loop indexed by $j$ (a loop over columns of matrix $C$).

• The Gemv indicates that the operation performed in the body of the loop is matrix-vector multiplication.

Similarly, J_Gemv_I_Dots means that the implementation of matrix-matrix multiplication being compiled and executed consists of a loop indexed by $j$ (a loop over the columns of matrix $C$) which itself calls a matrix-vector multiplication that is implemented as a loop indexed by $i$ (over the rows of the matrix with which the matrix-vector multiplication is performed), with a dot product in the body of the loop.

###### Remark 1.3.7.

Hopefully you are catching on to our naming convention. It is a bit ad hoc, and we are probably not completely consistent. The names give some hint as to what each implementation looks like.

Going forward, we refrain from further explaining the convention.