Parallelization of Asynchronous Variational Integrators on Distributed Memory Systems

M. Amber Hassaan

In the operator formulation, *ordered*
algorithms are algorithms in which there is an
application-specific order in which activities must appear to
have been executed. Time-dependent simulations like
discrete-event simulation are the classical examples: in these
applications, the simulation must respect the time-order of
events even if events are processed in parallel.
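
To make the ordering constraint concrete, here is a minimal sequential sketch of a discrete-event loop in Python. It is an illustration only, not code from Galois: the names `simulate` and `ping_pong` and the event format `(time, name)` are assumptions chosen for this example. Events are processed in nondecreasing timestamp order, and processing an event may schedule new ones.

```python
import heapq


def simulate(initial_events, handler, end_time):
    """Process events in nondecreasing timestamp order (sequential baseline).

    initial_events: iterable of (time, name) tuples.
    handler: hypothetical callback returning a list of new (time, name) events.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    processed = []
    while queue:
        time, name = heapq.heappop(queue)  # earliest pending event
        if time > end_time:
            break
        processed.append((time, name))
        # Processing an event may create new events.
        for new_event in handler(time, name):
            heapq.heappush(queue, new_event)
    return processed


def ping_pong(time, name):
    # Each event schedules the opposite event 1.5 time units later.
    return [(time + 1.5, "pong" if name == "ping" else "ping")]


trace = simulate([(0.0, "ping")], ping_pong, end_time=4.0)
```

A parallel implementation is free to process several events at once, provided the result is the same as if they had been handled in the order this loop produces.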

Ordered algorithms are far more complex to parallelize
effectively than unordered algorithms. One promising approach,
explored in Amber Hassaan's dissertation, is to use the *Kinetic
Dependence Graph* (KDG) [1,2]. The kinetic dependence graph
is a generalization of standard dependence graphs for
applications in which the execution of a task may create new
tasks. We have used the KDG to parallelize complex simulation
codes such as asynchronous variational integrators (AVIs), which
are finite-element codes that are irregular in both space and
time (no knowledge of finite elements is needed for this
project).
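
The following Python sketch gives a feel for the kind of ordering constraint involved. It is not the KDG construction from the papers. It is a simplified round-based scheme under two stated assumptions: an element update conflicts only with updates of neighboring elements, and an update is safe once its pending time is the earliest among its neighbors (ties broken by element id). All names (`safe_elements`, `run_rounds`, `next_time`, `step`) are hypothetical.

```python
def safe_elements(next_time, neighbors):
    """Elements whose (time, id) pair is smaller than that of every neighbor."""
    return [
        e for e in next_time
        if all((next_time[e], e) < (next_time[n], n) for n in neighbors[e])
    ]


def run_rounds(next_time, step, neighbors, end_time):
    """Advance elements round by round until all pending times exceed end_time.

    Elements updated in the same round are never neighbors, so under the
    assumption above they could be processed in parallel.
    """
    next_time = dict(next_time)  # do not mutate the caller's dictionary
    rounds = []
    while True:
        ready = [e for e in safe_elements(next_time, neighbors)
                 if next_time[e] <= end_time]
        if not ready:
            break
        rounds.append(sorted(ready))
        for e in ready:
            # Each element advances by its own time step, so executing an
            # update creates a new pending update for the same element.
            next_time[e] += step[e]
    return rounds


# A chain of four elements, 0 - 1 - 2 - 3, with different time steps.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
step = {0: 1.0, 1: 2.0, 2: 1.0, 3: 2.0}
rounds = run_rounds(dict(step), step, neighbors, end_time=2.0)
```

In this toy run, elements 0 and 2 are updated together in the first round because neither waits on an earlier neighbor. Each update schedules the next one for the same element, which is the "tasks create new tasks" behavior described above.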

At present, KDGs are implemented in the shared-memory version of
Galois, and we have an implementation of AVI that performs quite
well on large-scale shared-memory machines. The goal of this
project is to implement a distributed-memory version of AVI
using KDGs. The basic idea is to partition the mesh between the
hosts of a distributed-memory computer (we will give you a mesh
partitioner), and let each host perform simulations on its
portion of the mesh (this can be done in parallel using
shared-memory parallelism). Naturally, the KDG has to be
implemented in a partitioned way as well. Synchronization
between hosts will be needed to update the mesh and the KDG in a
consistent way.
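
As a starting point for thinking about where synchronization is needed, the sketch below finds the elements on each host that have a neighbor owned by a different host. It assumes the partitioner output can be read as a mapping from element to host. That format, and the names `boundary_elements` and `owner`, are assumptions for illustration and not the interface of the partitioner you will be given.

```python
def boundary_elements(neighbors, owner):
    """Map each host to the set of its elements with a neighbor on another host."""
    boundary = {}
    for elem, host in owner.items():
        boundary.setdefault(host, set())
        if any(owner[n] != host for n in neighbors[elem]):
            boundary[host].add(elem)
    return boundary


# A chain of four elements split between two hosts.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
owner = {0: 0, 1: 0, 2: 1, 3: 1}  # element -> host (hypothetical format)
boundary = boundary_elements(neighbors, owner)
```

Interior elements can be updated using only local information. Updates to boundary elements are the ones that may require communication with another host.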

- (Nov 1st) Read the papers; download, compile, and run Galois (beta release) and the AVI application
- (Nov 8th) A clear description in English of the overall parallelization strategy.
- (Nov 15th) An implementation of AVI in MPI, using shared-memory Galois within each host
- (Dec 6th) Implementation in the distributed-memory version of Galois (Abelian)

- (Dec 6th) Project report describing AVI and its distributed-memory implementation, with a detailed comparison of performance on a distributed machine (e.g. TACC) for large inputs.

- A project report, written like an ACM paper, describing what you did for your project.

- Brief Announcement: Parallelization of Asynchronous Variational Integrators for Shared Memory Architectures, Hassaan, Nguyen and Pingali, SPAA’2014
- The tao of parallelism in algorithms. Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, Dimitrios Prountzos, and Xin Sui. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '11, pages 12-25, 2011.

- Kinetic Dependence Graphs, Hassaan, Nguyen and Pingali, ASPLOS’2015