- Structure of parallelism and locality in important algorithms
- Algorithm abstractions: dependence graphs, halographs
- Multicore architectures: interconnection networks, cache coherence, locks, lock-free synchronization
- Memory consistency models
- Optimistic parallel execution of programs
- Scheduling and load-balancing
- Parallel data structures: linearizability, lock-free data structures, transactional memory, array/graph partitioning
- Memory hierarchies and locality
- Cache-oblivious algorithms
- Compiler analysis and transformations for regular and irregular programs
- Performance models: PRAM, BPRAM, LogP
- Self-optimizing software, machine-learning techniques for program optimization
- GPUs and GPU programming
- Case studies: Cilk, PGAS languages, TBB, MapReduce
- Approximate computing for power and energy optimization
Students will present papers, participate in discussions, and do a substantial final project. The readings will include some of the classic papers in the field of parallel programming.
Prerequisites: programming maturity, knowledge of C/C++, and basic courses on modern computer architecture and compilers.
For basic material on computer architecture, read "Computer Architecture: A Quantitative Approach" by Hennessy and Patterson, Morgan Kaufmann Publishers.
For basic material on compilers, read "Optimizing Compilers for Modern Architectures" by Allen and Kennedy.
Lecture schedule and notes
Assignments
Announcements
Presentation Schedule and Readings