Lecture Material

Basic material

(1) Course overview

Parallel architectures, parallel algorithms, parallel data structures
Slides: intro.pdf
(1) Moore's Law paper, Electronics, 1965.
(2) Static Power Model for Architects, Butts and Sohi, MICRO 2000.
(3) Introduction to the Cell processor, Kahle et al., IBM J. Res. & Dev., July 2005.
(4) The TAO of Parallelism in Algorithms, Pingali et al., 2011.

(2) Sources of Parallelism and Locality in Regular and Irregular Algorithms
Ordinary differential equations (ODEs), finite differences, finite elements, n-body methods, graph analytics
Slides: Some computational science algorithms, Graph algorithms, MachineLearning1, MachineLearning2
(1) Mathematica tutorial on numerical methods for solving PDEs
(2) Delta-stepping: A Parallel Single-Source Shortest Path Algorithm, Meyer and Sanders, ESA '98.
(3) The anatomy of a large-scale hypertextual web search engine, Brin and Page, Computer Networks and ISDN Systems, April 1998.

(3) Locality (I): Temporal and spatial locality, caches, blocked algorithms
Slides: Cache models for locality
(1) Anatomy of high-performance matrix multiplication, Goto and van de Geijn, ACM TOMS, May 2008.

(4) Locality (II): Cache-oblivious algorithms
Slides: Cache-oblivious Programs
(1) Cache-oblivious algorithms, Frigo et al., FOCS '99.
(2) An experimental comparison of cache-oblivious and cache-conscious programs, Yotov et al., SPAA 2007.

(5) Vectorization

(6) Optimization for Performance
Slides: Memory Optimization, Graph partitioning

(7) Synthesis of parallel programs