Course objectives: To obtain the high level of
end-to-end performance needed in problem domains like graphics,
computer games, and scientific computing, programs must
exploit many of the features of modern computer
architectures. In this course, we will study the
performance-critical features of modern computer architectures,
and discuss how applications can take advantage of them to
obtain high performance. This is not a course on software
tricks; rather, the emphasis is on abstractions of computer
architecture, understanding performance, and obtaining
performance when you need it.
Topics covered in lecture include the following:
- Analysis of applications that need high end-to-end
performance
- Understanding performance: performance models, Amdahl's law
- Measurement and design of computer experiments
- Microbenchmarks for abstracting performance-critical aspects of computer systems
- Memory hierarchy: caches, virtual memory, exploiting spatial
and temporal locality
- Vectors and vectorization
- GPUs and GPU programming
- Multi-core processors and shared-memory programming, OpenMP
- Distributed-memory machines and message-passing programming,
MPI
- Optimistic parallelization
- Self-optimizing software
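As a small illustration of the performance-model topic above, here is a minimal sketch of Amdahl's law in Python. The function name amdahl_speedup and its parameters are hypothetical and chosen for illustration only; they are not taken from the course materials. The sketch assumes a fraction p of the running time parallelizes perfectly across n workers while the remaining 1 - p stays serial, so the speedup is 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law (illustrative helper, not course code).

    parallel_fraction: share of the original running time that can be
                       parallelized, between 0.0 and 1.0.
    workers:           number of processors applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # New running time (relative to the original) is the untouched serial
    # part plus the parallel part divided evenly among the workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # With 90% of the work parallelizable, the speedup can never reach
    # 1 / (1 - 0.9) = 10, no matter how many workers are added.
    for n in (1, 2, 8, 64, 1024):
        print(f"workers={n:5d}  speedup={amdahl_speedup(0.9, n):.2f}")
```

The point of the model is that the serial fraction bounds the achievable speedup, which is why measuring where time is actually spent comes before any optimization effort.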
Prerequisites:
programming maturity, knowledge of C/C++, and a basic course on modern
computer architecture
Course work: There will be six substantial programming
assignments (60% of grade), a mid-semester exam (15% of grade)
and a final exam (25% of grade).
Discussion and assignments:
You need to use Canvas and Piazza for discussion and for submitting assignments.
Please enroll in Piazza and follow the instructions to create a TACC account as soon as possible.