
1.5.1 Matrix-vector multiplication

The basic operation to be performed is given by A x = y.

A x = y, x and y distributed like vectors: For this case, assume that x and y are identically distributed according to the inducing vector distribution that induced the distribution of matrix A. By spreading vector x within columns of nodes, we duplicate all necessary elements of x so that the local matrix-vector multiplication can commence on each node. After this, a reduction (summation) of the local partial results within rows of nodes yields the desired vector y. However, since each node needs to know only a portion of vector y, a distributed reduction (MPI_Reduce_scatter) within rows of nodes suffices. This process is illustrated in Figure 1.5. In this figure, the matrix tex2html_wrap_inline12573 denotes the sub-matrix of A assigned to node (i,j).

In general,

displaymath12547

After spreading the sub-vectors of x within columns of nodes, node (i,j) holds the following sub-vectors:

displaymath12548

Thus, all sub-vectors of x required for the local matrix-vector multiply are in place. After executing the local matrix-vector multiply, each node owns a local contribution to part of y, so that a summation of the results within rows of nodes completes the matrix-vector multiply, leaving the appropriate piece of the result vector on each node. We will see in a later chapter that this summation within one dimension of the mesh becomes a basic operation in PLAPACK.
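The three steps for vector-distributed x and y (spread within columns, local multiply, reduce-scatter within rows) can be traced with a small simulation. The sketch below is not PLAPACK code and makes no MPI calls: plain dictionaries stand in for per-node memory, and the loops stand in for what the collective operations would achieve. The layout is an assumption made for illustration: blocks of b vector elements are dealt to an r x c mesh in column-major order, each node owns exactly one block, and the names owner and matvec_vector_distributed are hypothetical.

```python
def owner(k, r, b):
    """Node (row, col) owning vector element k. Assumed layout: blocks of
    size b are dealt to the mesh in column-major order."""
    blk = k // b
    return blk % r, blk // r


def matvec_vector_distributed(A, x, r, c, b=1):
    """Simulate y = A x on an r x c mesh; returns {node: {index: y value}}."""
    n = len(x)
    assert n == r * c * b, "sketch assumes exactly one block per node"
    nodes = [(i, j) for i in range(r) for j in range(c)]

    # Initial state: each node holds only its own elements of x.
    x_local = {node: {} for node in nodes}
    for k in range(n):
        x_local[owner(k, r, b)][k] = x[k]

    # Step 1: spread x within columns of nodes, so every node in column j
    # ends up with all elements of x owned by that column.
    x_spread = {}
    for j in range(c):
        column_copy = {}
        for i in range(r):
            column_copy.update(x_local[(i, j)])
        for i in range(r):
            x_spread[(i, j)] = dict(column_copy)

    # Step 2: local matrix-vector multiply. Node (i, j) touches only entries
    # A[k][l] whose row k belongs to node row i and column l to node column j.
    partial = {}
    for (i, j) in nodes:
        partial[(i, j)] = {
            k: sum(A[k][l] * xl for l, xl in x_spread[(i, j)].items())
            for k in range(n) if owner(k, r, b)[0] == i
        }

    # Step 3: reduce-scatter within rows of nodes. Each summed element of y
    # lands only on the node that owns it.
    y_local = {node: {} for node in nodes}
    for i in range(r):
        for k in partial[(i, 0)]:
            y_local[owner(k, r, b)][k] = sum(partial[(i, j)][k] for j in range(c))
    return y_local
```

Calling matvec_vector_distributed(A, x, r=2, c=3) on a 6 x 6 matrix returns one dictionary per node, and each node ends up with exactly the elements of y that it would own under the assumed vector distribution.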

A x = y, matrix row x and matrix column y: Again, we wish to perform A x = y, but this time we assume that x and y are a row and a column of a matrix, respectively, where the distribution of that matrix is induced by the same inducing vector distribution as that of matrix A. By spreading (broadcasting) matrix row x within columns of nodes, we duplicate all necessary elements of x so that the local matrix-vector multiplication can commence on each node. After this, a summation of the local partial results within rows of nodes yields the desired vector y. Since y is a column, existing on only one column of nodes, a summation to one node (MPI_Reduce) within each row of nodes can be used.
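The matrix-row-to-matrix-column case can be traced with a small simulation as well. This is a sketch under stated assumptions, not PLAPACK code: it makes no MPI calls, dictionaries stand in for per-node memory, and the layout (blocks of b indices dealt to an r x c mesh in column-major order, one block per node) is assumed for illustration. The names row_of, col_of, and matvec_row_to_column are hypothetical, as are the parameters i0 (the node row holding x) and j0 (the node column that receives y).

```python
def row_of(k, r, b):
    """Node row holding matrix row k (assumed column-major block layout)."""
    return (k // b) % r


def col_of(l, r, b):
    """Node column holding matrix column l (assumed column-major block layout)."""
    return (l // b) // r


def matvec_row_to_column(A, x, r, c, i0=0, j0=0, b=1):
    """Simulate y = A x with x a matrix row and y a matrix column."""
    n = len(x)
    assert n == r * c * b, "sketch assumes exactly one block per node"
    nodes = [(i, j) for i in range(r) for j in range(c)]

    # Initial state: x is a matrix row, so it lives on node row i0 only.
    # Node (i0, j) holds the elements aligned with its matrix columns.
    x_local = {node: {} for node in nodes}
    for l in range(n):
        x_local[(i0, col_of(l, r, b))][l] = x[l]

    # Step 1: broadcast within columns of nodes, with node (i0, j) as root.
    x_spread = {(i, j): dict(x_local[(i0, j)]) for (i, j) in nodes}

    # Step 2: local matrix-vector multiply on every node.
    partial = {}
    for (i, j) in nodes:
        partial[(i, j)] = {
            k: sum(A[k][l] * xl for l, xl in x_spread[(i, j)].items())
            for k in range(n) if row_of(k, r, b) == i
        }

    # Step 3: reduce within each row of nodes to the single node in column
    # j0, since y exists on only one column of nodes.
    y_local = {node: {} for node in nodes}
    for i in range(r):
        for k in partial[(i, j0)]:
            y_local[(i, j0)][k] = sum(partial[(i, j)][k] for j in range(c))
    return y_local
```

Compared with the vector-distributed case, the only changes are the collectives: a broadcast from one node per column replaces the spread among all nodes of a column, and a reduction to one root per row replaces the reduce-scatter.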
A x = y, matrix column x and matrix row y: Now we assume that x and y are a column and a row of a matrix, respectively, where the distribution of that matrix is induced by the same inducing vector distribution as that of matrix A. By spreading matrix column x within rows of nodes, we duplicate all necessary elements of x so that the local matrix-vector multiplication can commence on each node. After this, a summation within rows of nodes (MPI_Reduce_scatter) must occur, leaving the result distributed like the inducing vector. The final operation is to redistribute (gather) the result to the row of the target matrix.
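The matrix-column-to-matrix-row case can be sketched in the same simulated style. Nothing here is PLAPACK code and no MPI calls are made; dictionaries stand in for per-node memory. The layout (blocks of b indices dealt to an r x c mesh in column-major order, one block per node) is an assumption, and the names owner and matvec_column_to_row, together with the parameters j0 (node column holding x) and i0 (node row receiving y), are hypothetical. The text describes the spreading of x only briefly. Under the assumed layout, this sketch realizes it in two stages: a scatter within rows of nodes that leaves x distributed like the inducing vector, followed by a spread within columns as in the vector-distributed case. That two-stage reading is an interpretation, not something the text spells out.

```python
def owner(k, r, b):
    """Node (row, col) owning vector element k. Assumed layout: blocks of
    size b are dealt to the mesh in column-major order."""
    blk = k // b
    return blk % r, blk // r


def matvec_column_to_row(A, x, r, c, j0=0, i0=0, b=1):
    """Simulate y = A x with x a matrix column and y a matrix row."""
    n = len(x)
    assert n == r * c * b, "sketch assumes exactly one block per node"
    nodes = [(i, j) for i in range(r) for j in range(c)]

    # Initial state: x is a matrix column, so it lives on node column j0.
    x_col = {node: {} for node in nodes}
    for k in range(n):
        x_col[(owner(k, r, b)[0], j0)][k] = x[k]

    # Step 1a (assumed): scatter within rows of nodes, leaving x
    # distributed like the inducing vector.
    x_vec = {node: {} for node in nodes}
    for i in range(r):
        for k, v in x_col[(i, j0)].items():
            x_vec[owner(k, r, b)][k] = v

    # Step 1b (assumed): spread within columns of nodes.
    x_spread = {}
    for j in range(c):
        merged = {}
        for i in range(r):
            merged.update(x_vec[(i, j)])
        for i in range(r):
            x_spread[(i, j)] = dict(merged)

    # Step 2: local matrix-vector multiply on every node.
    partial = {}
    for (i, j) in nodes:
        partial[(i, j)] = {
            k: sum(A[k][l] * xl for l, xl in x_spread[(i, j)].items())
            for k in range(n) if owner(k, r, b)[0] == i
        }

    # Step 3: reduce-scatter within rows of nodes; y is now distributed
    # like the inducing vector.
    y_vec = {node: {} for node in nodes}
    for i in range(r):
        for k in partial[(i, 0)]:
            y_vec[owner(k, r, b)][k] = sum(partial[(i, j)][k] for j in range(c))

    # Step 4: gather within columns of nodes onto node row i0, which makes
    # y a row of the target matrix.
    y_row = {node: {} for node in nodes}
    for (i, j), part in y_vec.items():
        y_row[(i0, j)].update(part)
    return y_row
```

After the final gather, only the nodes in row i0 hold data, and node (i0, j) holds the elements of y that align with the matrix columns assigned to node column j.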



rvdg@cs.utexas.edu