Lecture Schedule

Note: more papers will be added to the reading list later.

August

27
Course overview: Parallel architectures, parallel algorithms, parallel data structures
Slides: Introduction to CS 395T
Readings:
(1) Moore's Law paper, Electronics, 1965.
(2) Static Power Model for Architects, Butts and Sohi, Micro 2000.
(3) Introduction to the Cell processor, Kahle et al, IBM J.Res&Dev, July 2005
(4) Amorphous Data-Parallelism, Pingali et al., 2009

September

1 Algorithms (I): Parallelism in Computational Science Algorithms (a)
Ordinary differential equations (ODEs), finite differences, systems of ODEs
Presenter: Keshav Pingali
Slides: Some computational science algorithms
Readings:
(1) Mathematica tutorial on numerical methods for solving PDEs
3 Algorithms (II): Parallelism in Computational Science Algorithms (b)
Partial differential equations (PDEs), linear system solvers, finite-element method
Presenter: Keshav Pingali
Slides: see September 1 lecture
8
Algorithms (III): Parallelism in Irregular algorithms (a)
Amorphous data-parallelism, N-body methods, mesh generation, mesh refinement
Presenters: Amber Hassan and Xin Sui
Slides: Introduction to Irregular Algorithms
Readings:
(1) Data parallel algorithms, Hillis and Steele, CACM, 1986
(2) Amorphous Data-Parallelism, Pingali et al., 2009
10
Algorithms (IV): Parallelism in Irregular algorithms (b)
Mesh refinement, Maxflow algorithms, event-driven simulation
Presenters: Amber Hassan and Xin Sui
Slides:
(1) Barnes-Hut
(2) Mesh Generation and Graph Partitioning
(3) Preflow Push

 

15
Abstractions for regular algorithms and machines: Dependence graphs, data dependences, control dependences, PRAM model, DAG scheduling
Presenter: Keshav Pingali
Slides:
(1) Algorithm and machine abstractions: dependence graphs and PRAM model
(2) Control dependence computation
Readings:
(1) Dependence graphs and compiler optimizations, Kuck et al., POPL 1981
(2) The program dependence graph and its use in optimization, Ferrante, Ottenstein, Warren, TOPLAS, 1987
(3) Optimal control dependence computation, Pingali and Bilardi, TOPLAS, 1997
(4) Experimental evaluation of list scheduling, Cooper et al, Rice TR, 1998
(5) From control flow to dataflow, Beck et al., JPDC 1989
17
Abstractions for irregular algorithms and machines: halographs, optimistic execution of programs, dynamic scheduling
Presenter: Donald Nguyen
Slides: see amorphous data-parallelism slides
22
Architecture (I): Multicore architectures, cache coherence
Presenters: Manish Arora, Mrinal Deo
Slides: Coherent caches
24
Architecture (II): Locks, lock-free synchronization, memory consistency models
Presenter: Ivan Jibaja
Slides: Memory consistency models
29 Dynamic load-balancing
Presenters: Rashid Kaleem, Amber Hassan
Slides: Dynamic load-balancing
Readings:
(1) Load Balancing literature survey
(2) Scheduling multi-threaded computations by work-stealing, Blumofe and Leiserson, JACM, 1999.

October
Parallel data structures (I): Lock/wait-free data structures
Presenter: Augustine Matthews
Slides: Locks and lock-free synchronization
Parallel data structures (II): Galois data structures, array and graph partitioning
Presenter: Donald Nguyen
Readings:
(1) An efficient heuristic procedure for partitioning graphs, Kernighan and Lin, Bell System Technical Journal, 1970.
(2) A fast and high quality multilevel scheme etc. Karypis and Kumar, SIAM J. Sci. Comput. 1998.
Parallel data structures (III): Transactional memory
Presenters: Srivastava Daruru, Saurabh Shukla
Readings:
(1) Software Transactional Memory, Nir Shavit, Dan Touitou, PODC 1995
(2) Transactional Memory: Architectural Support for Lock-Free Data Structures, Maurice Herlihy, J. Eliot B. Moss, ISCA 1993.

13 Locality (I): Temporal and spatial locality in algorithms, blocking, unit-stride accesses
Presenter: Keshav Pingali
Slides: Cache models for locality
Readings:
(1) Evaluation techniques for storage hierarchies, Mattson et al, IBM Systems Journal, 1970.
15 Locality (II): Case studies: MMM, matrix factorization, stencil codes
Presenter: Keshav Pingali

Readings:
(1) Anatomy of high-performance matrix multiplication, Goto et al, ACM TOMS, May 2008.
(2) Optimizing matrix multiply using PHiPAC, Bilmes et al, LAPACK Working Note 111.
20 Locality (III): Cache-oblivious algorithms
Presenter: Keshav Pingali
Readings:
(1) Cache-oblivious algorithms, Frigo et al, FOCS 99
(2) An experimental comparison of cache-oblivious and cache-conscious programs, Yotov et al, SPAA 2007
22 Compiler analysis and transformation (I): Integer linear programming, dependence analysis of dense array programs
Presenter: Keshav Pingali
Readings:
(1) The Omega test, Pugh, Supercomputing 91
27 Compiler analysis and transformation (II): Loop transformations of dense array programs
Presenter: Keshav Pingali
29 Compiler analysis and transformation (III): Points-to and shape analysis
Presenter: Dimitrios Prountzos
Slides: Analysis of programs with pointers
Readings:
(1) Tutorial on points-to analysis, Michael Hind


November
3
Performance modeling: PRAM, BPRAM, LogP
Presenter: Keshav Pingali
5
Auto-tuning (I): ATLAS, FFTW
Presenter: Keshav Pingali
Slides: Optimizing MMM and the ATLAS code generator
Readings:
(1) Is search really necessary to generate high-performance BLAS?, Yotov et al, Proceedings of IEEE, March 2005.
10
Auto-tuning (II): Machine learning techniques for program optimization
Presenter: Amin Shali
12
Special topics: GPU programming
Presenter: Apollo Ellis
Readings:
(1) A survey of general-purpose computation on graphics hardware, Owens et al, Eurographics 2005.
17
Parallel language/library case studies (I): MPI
Presenter: Keshav Pingali
Slides: Introduction to MPI, Advanced MPI
Readings:
(1) MPI Groups and Topologies, J. Squyres, ClusterWorld 2004
19
Parallel language/library case studies (II): PGAS languages
Presenters: TBD
24
Parallel language/library case studies (III): Cilk, TBB, Map-reduce
Presenters: Sangmin Lee, Yang Wang
Readings:
(1) Cilk, an efficient multithreaded runtime system, Blumofe et al, PPoPP 1995

December
1 Parallel language/library case studies (IV): functional languages and dataflow
Presenter: Keshav Pingali
3 Research directions in parallel programming
Presenter: Keshav Pingali