Next: Implementation of the copy Up: A Building Block Approach Previous: Collective communication operations

Efficient implementation of collective communication

 

Over the last few years, we have written a number of papers concerning the efficient implementation of collective communication operations on parallel architectures []. As part of that research, we have noticed that, given efficient implementations of scatter, gather, collect, and distributed reduction, one can build efficient implementations of broadcast, reduce-to-one, and reduce-to-all by making the following observations:

Indeed, given optimal implementations of scatter, gather, collect, and distributed reduction, implementing the other operations as described can be shown to be asymptotically (for long vectors of data) within a factor of two of optimal, or even optimal.
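
The observations referred to above are not reproduced in this text. As an illustration only, the sketch below assumes the compositions commonly used for this building-block approach: broadcast as a scatter followed by a collect, reduce-to-one as a distributed reduction followed by a gather, and reduce-to-all as a distributed reduction followed by a collect. It simulates the nodes as Python lists inside a single process, so no real message passing takes place, and all function names are hypothetical rather than taken from any library.

```python
from functools import reduce as fold
import operator


def scatter(vec, p):
    """Split one node's vector into p nearly equal blocks, one per node."""
    n = len(vec)
    bounds = [(i * n) // p for i in range(p + 1)]
    return [vec[bounds[i]:bounds[i + 1]] for i in range(p)]


def gather(blocks):
    """Concatenate the per-node blocks into a single vector on one node."""
    return [x for block in blocks for x in block]


def collect(blocks):
    """Give every node the concatenation of all per-node blocks."""
    full = gather(blocks)
    return [list(full) for _ in blocks]


def distributed_reduce(vecs, op=operator.add):
    """Reduce p equal-length vectors element-wise; node i keeps block i."""
    p = len(vecs)
    total = [fold(op, column) for column in zip(*vecs)]
    return scatter(total, p)


# Assumed compositions (not confirmed by the text above):
def broadcast(vec, p):
    return collect(scatter(vec, p))


def reduce_to_one(vecs, op=operator.add):
    return gather(distributed_reduce(vecs, op))


def reduce_to_all(vecs, op=operator.add):
    return collect(distributed_reduce(vecs, op))
```

For example, `broadcast([1, 2, 3, 4, 5], 3)` leaves each of the three simulated nodes holding the full vector, and `reduce_to_one` applied to three vectors returns their element-wise sum as a single vector.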



rvdg@cs.utexas.edu