To obtain the high level of end-to-end performance needed in problem
domains like graphics, computer games, and scientific computing,
programs must exploit many of the features of modern computer
architectures. In this course, we will study the performance-critical
features of these architectures and discuss how applications can take
advantage of them to obtain high performance. This is not a course on
software tricks; rather, the emphasis is on abstractions of computer
architecture, understanding performance, and obtaining performance when
you need it.
Topics include the following:
- Analysis of applications that need high end-to-end performance
- Understanding performance: performance models, Amdahl's law
- Measurement and design of computer experiments
- Microbenchmarks for abstracting performance-critical aspects of computer systems
- Memory hierarchy: caches, virtual memory, exploiting spatial and
temporal locality
- Vectors and vectorization
- GPUs and GPU programming
- Multi-core processors and shared-memory programming, OpenMP
- Distributed-memory machines and message-passing programming, MPI
- Optimistic parallelization
- Self-optimizing software
- HPCS languages: X10, Fortress, Chapel
There will be 4 or 5 substantial programming assignments and a final
project.
Prerequisites: programming maturity, knowledge of C/C++, and a basic
course on modern computer architecture.
ISBN 0-07-282256-2
For basic material on computer architecture, read "Computer Architecture: A Quantitative Approach"
by Hennessy & Patterson, Morgan Kaufmann Publishers.