that compute \(y := A x + y \) via the algorithms in Figure~\ref{fig-mvmult_alt}.
In the next few units, we will modify the matrix-vector multiplication algorithms from last week so that they can take advantage of matrices with special structure (e.g., triangular or symmetric matrices).
\begin{equation*}
\renewcommand{\routinename}{ \left[ A \right] := \mbox{Algorithm\_Skeleton}( A ) }
% Step 3: Loop-guard
\renewcommand{\guard}{
m( A_{TL} ) \lt m( A )
}
% Step 4: Define Initialize
\renewcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
}
\renewcommand{\partitionsizes}{
A_{TL} is 0 \times 0
}
% Step 5a: Repartition the operands
\renewcommand{\repartitionings}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\rightarrow
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
}
\renewcommand{\repartitionsizes}{
\alpha_{11} is 1 \times 1 }
% Step 5b: Move the double lines
\renewcommand{\moveboundaries}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\leftarrow
\FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
}
% Step 8: Insert the updates required to change the
% state from that given in Step 6 to that given in Step 7
% Note: The below needs editing!!!
\renewcommand{\update}{
\begin{array}{l @{\hspace{1in}}}
~~\\
~~\\
~~
\end{array}
}
\FlaAlgorithm
\end{equation*}
Figure 4.2.3.1. Code skeleton for algorithms when matrices are triangular or symmetric.
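In code, the traversal captured by this skeleton might be outlined as follows. This is a hypothetical Python/NumPy sketch (the function name is ours), assuming a square matrix; the loop index plays the role of \(m( A_{TL} ) \text{,}\) and the update step is deliberately left blank, just as in the skeleton.

```python
import numpy as np

def algorithm_skeleton(A):
    """Hypothetical outline of the code skeleton: march through the
    square matrix A along its diagonal, exposing the 3x3 repartitioning."""
    m = A.shape[0]
    for i in range(m):  # loop guard: m(A_TL) < m(A)
        # Repartition: expose alpha11 and the blocks around it.
        A00, a01, A02 = A[:i, :i], A[:i, i], A[:i, i+1:]
        a10t, alpha11, a12t = A[i, :i], A[i, i], A[i, i+1:]
        A20, a21, A22 = A[i+1:, :i], A[i+1:, i], A[i+1:, i+1:]
        # Update: left blank in the skeleton.
        pass
    return A
```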
What makes a triangular or symmetric matrix special? For one thing, it is square. For another, only one triangle of the matrix needs to be stored. It was for this reason that we ended up with "algorithm skeletons" that looked like the one in Figure~\ref{fig-skel1} when we presented algorithms for "triangularizing" or "symmetrizing" a matrix.
Now, consider a typical partitioning of a matrix that is encountered in such an algorithm:
\begin{equation*}
A \rightarrow
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}} .
\end{equation*}
If the matrix is lower triangular, then \(a_{01} = 0 \text{,}\) \(A_{02} = 0 \text{,}\) and \(a_{12}^T = 0\text{.}\) (Remember: the "\(0\)" is a matrix or vector "of appropriate size".) If instead the matrix is symmetric with only the lower triangular part stored, then \(a_{01} = \left( a_{10}^T \right)^T = a_{10} \text{,}\) \(A_{02} = A_{20}^T \text{,}\) and \(a_{12}^T = a_{21}^T \text{.}\)
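These zero and transpose relationships can be verified numerically. The following sketch is illustrative (the \(4 \times 4\) matrix and the choice of index are ours): it extracts the blocks named in the \(3 \times 3\) repartitioning and checks the claims for a lower triangular and a symmetric matrix.

```python
import numpy as np

# A lower triangular 4x4 matrix and a "current" index i.
A = np.tril(np.arange(1.0, 17.0).reshape(4, 4))
i = 2

a01 = A[:i, i]       # above the diagonal entry: zero when A is lower triangular
A02 = A[:i, i+1:]    # strictly upper block: zero
a12t = A[i, i+1:]    # to the right of alpha11: zero
assert not a01.any() and not A02.any() and not a12t.any()

# For a symmetric matrix, the upper blocks are transposes of the lower ones.
S = A + np.tril(A, -1).T            # symmetrize: copy strict lower triangle upward
assert np.allclose(S[:i, i], S[i, :i])          # a01 == a10
assert np.allclose(S[:i, i+1:], S[i+1:, :i].T)  # A02 == A20^T
```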
\begin{equation*}
% Define the operation to be computed
\renewcommand{\routinename}{ y := \mbox{Mvmult\_n\_unb\_var1b}( A, x, y ) }
% Step 3: Loop-guard
\renewcommand{\guard}{
m( A_{TL} ) \lt m( A )
}
% Step 4: Define Initialize
\renewcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
, \\
x \rightarrow
\FlaTwoByOne{x_{T}}
{x_{B}}
,
y \rightarrow
\FlaTwoByOne{y_{T}}
{y_{B}}
}
\renewcommand{\partitionsizes}{
A_{TL} is 0 \times 0 ,
x_{T} , y_T are 0 \times 1
}
% Step 5a: Repartition the operands
\renewcommand{\repartitionings}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\rightarrow
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
, \\
\FlaTwoByOne{ x_T }
{ x_B }
\rightarrow
\FlaThreeByOneB{x_0}
{\chi_1}
{x_2}
,
\FlaTwoByOne{ y_T }
{ y_B }
\rightarrow
\FlaThreeByOneB{y_0}
{\psi_1}
{y_2}
}
\renewcommand{\repartitionsizes}{
\alpha_{11} ,
\chi_1 , and
\psi_1 are scalars}
% Step 5b: Move the double lines
\renewcommand{\moveboundaries}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\leftarrow
\FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
, \\
\FlaTwoByOne{ x_T }
{ x_B }
\leftarrow
\FlaThreeByOneT{x_0}
{\chi_1}
{x_2}
,
\FlaTwoByOne{ y_T }
{ y_B }
\leftarrow
\FlaThreeByOneT{y_0}
{\psi_1}
{y_2}
}
% Step 8: Insert the updates required to change the
% state from that given in Step 6 to that given in Step 7
\renewcommand{\update}{
\begin{array}{l}
\psi_1 \becomes a_{10}^T x_0 + \alpha_{11} \chi_1 + a_{12}^T x_2 + \psi_1
\phantom{ \FlaThreeByOneB{ y_0 }{\psi_1}{y_2}
}
\end{array}
}
\FlaAlgorithm
\end{equation*}
\begin{equation*}
% Define the operation to be computed
\renewcommand{\routinename}{ y := \mbox{Mvmult\_n\_unb\_var2b}( A, x, y ) }
% Step 3: Loop-guard
\renewcommand{\guard}{
m( A_{TL} ) \lt m( A )
}
% Step 4: Define Initialize
\renewcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
, \\
x \rightarrow
\FlaTwoByOne{x_{T}}
{x_{B}}
,
y \rightarrow
\FlaTwoByOne{y_{T}}
{y_{B}}
}
\renewcommand{\partitionsizes}{
A_{TL} is 0 \times 0 ,
x_{T} , y_T are 0 \times 1
}
% Step 5a: Repartition the operands
\renewcommand{\repartitionings}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\rightarrow
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
, \\
\FlaTwoByOne{ x_T }
{ x_B }
\rightarrow
\FlaThreeByOneB{x_0}
{\chi_1}
{x_2}
,
\FlaTwoByOne{ y_T }
{ y_B }
\rightarrow
\FlaThreeByOneB{y_0}
{\psi_1}
{y_2}
}
\renewcommand{\repartitionsizes}{
\alpha_{11} ,
\chi_1 , and
\psi_1 are scalars}
% Step 5b: Move the double lines
\renewcommand{\moveboundaries}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\leftarrow
\FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
, \\
\FlaTwoByOne{ x_T }
{ x_B }
\leftarrow
\FlaThreeByOneT{x_0}
{\chi_1}
{x_2}
,
\FlaTwoByOne{ y_T }
{ y_B }
\leftarrow
\FlaThreeByOneT{y_0}
{\psi_1}
{y_2}
}
% Step 8: Insert the updates required to change the
% state from that given in Step 6 to that given in Step 7
\renewcommand{\update}{
\begin{array}{l}
y_0 \becomes \chi_1 a_{01} + y_0 \\
\psi_1 \becomes \chi_1 \alpha_{11} + \psi_1 \\
y_2 \becomes \chi_1 a_{21} + y_2
\end{array}
}
\FlaAlgorithm
\end{equation*}
Figure 4.2.3.2. Alternative algorithms for matrix-vector multiplication.
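These two algorithms might be rendered in code as follows. This is a hypothetical NumPy sketch (the function names are ours), assuming a square matrix as in the triangular/symmetric setting; the slices mirror the repartitionings in the worksheets above.

```python
import numpy as np

def mvmult_n_unb_var1b(A, x, y):
    """Variant 1b: y := A x + y, one dot product per row of A,
    with the current row split as ( a10^T | alpha11 | a12^T )."""
    for i in range(A.shape[0]):
        # psi1 := a10^T x0 + alpha11 chi1 + a12^T x2 + psi1
        y[i] += A[i, :i] @ x[:i] + A[i, i] * x[i] + A[i, i+1:] @ x[i+1:]
    return y

def mvmult_n_unb_var2b(A, x, y):
    """Variant 2b: y := A x + y, one axpy per column of A,
    with the current column split as ( a01 / alpha11 / a21 )."""
    for j in range(A.shape[1]):
        chi1 = x[j]
        y[:j] += chi1 * A[:j, j]      # y0   := chi1 a01     + y0
        y[j] += chi1 * A[j, j]        # psi1 := chi1 alpha11 + psi1
        y[j+1:] += chi1 * A[j+1:, j]  # y2   := chi1 a21     + y2
    return y
```

Both should agree with an ordinary matrix-vector product; for example, `mvmult_n_unb_var1b(A, x, np.zeros(A.shape[0]))` should equal `A @ x`.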
The above observation leads us to the alternative algorithms for computing \(y := A x + y \) given in Figure~\ref{fig-mvmult_alt}. Note:
For the left algorithm, what was previously the "current" row in matrix \(A \text{,}\) \(a_{1}^T \text{,}\) is now viewed as consisting of three parts:
\begin{equation*}
a_{1}^T = \left( \begin{array}{c I c | c}
a_{10}^T \amp \alpha_{11} \amp a_{12}^T
\end{array}
\right)
\end{equation*}
while the vector \(x \) is now also partitioned into three parts:
\begin{equation*}
x = \FlaThreeByOneB{x_0}
{\chi_1}
{x_2}
\end{equation*}
so that \(a_{1}^T x = a_{10}^T x_0 + \alpha_{11} \chi_1 + a_{12}^T x_2 \text{.}\) Similarly, for the right algorithm, what was previously the "current" column in matrix \(A \text{,}\) \(a_{1} \text{,}\) is now viewed as consisting of three parts, \(a_{01} \text{,}\) \(\alpha_{11} \text{,}\) and \(a_{21} \text{,}\) while the vector \(y \) is partitioned into \(y_0 \text{,}\) \(\psi_1 \text{,}\) and \(y_2 \text{.}\)
Now, for matrix-vector multiplication \(y := A x + y \text{,}\) it is not beneficial to break the computation up in this way. Typically, a dot product is more efficient than multiple operations with the subvectors. Similarly, one \axpy\ is typically more efficient than multiple \axpy{}s. But the observations in this unit lay the foundation for modifying the algorithms to take advantage of special structure in the matrix, later this week.
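The point that the split-up computation produces the same result (only the efficiency differs) can be checked numerically. The following sketch is illustrative, with a random matrix and a row index of our choosing:

```python
import numpy as np

# Splitting the current row's dot product into three parts (as in the
# left algorithm) gives the same result as a single dot product.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
x = rng.standard_normal(5)
i = 2  # index of the "current" row

one_dot = A[i, :] @ x
three_parts = A[i, :i] @ x[:i] + A[i, i] * x[i] + A[i, i+1:] @ x[i+1:]
assert np.isclose(one_dot, three_parts)
```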