Course objectives: To
obtain the high level of end-to-end performance needed in
problem domains like graphics, computer games, and scientific
computing, it is necessary for programs to exploit many of the
features of modern computer architectures. In this course,
we will study the performance-critical features of modern
computer architectures, and discuss how applications can take
advantage of them to obtain high performance. This is not
a course on software tricks; rather, the emphasis is on
abstractions of computer architecture, understanding
performance, and obtaining performance when you need it.
This semester, CS377P will be co-taught by a team from Intel: Jackson Marusarz, Areg Melik-Adamyan, Gergana Slavova, and Mike Voss. They will present lectures on Intel's performance tools like VTune and Advisor, and teach students how these tools can be used to analyze and improve program performance. Course assignments will require the use of these tools.
Topics covered in lecture include the following:
- Parallelism and locality in algorithms
- Measurement of performance
- Memory hierarchies and locality exploitation
- Vector processing and vectorization
- Multi-core processors and shared-memory programming:
Pthreads, OpenMP, Intel TBB
- Distributed-memory machines and message-passing programming:
MPI
- GPUs and GPU programming
- Performance analysis tools: Intel VTune and Advisor
Prerequisites:
programming maturity, knowledge of C/C++, and a basic course on modern
computer architecture
Course work: There will be six or seven substantial programming
assignments (60% of grade), a mid-semester exam (15% of grade),
and a final exam (25% of grade).
Discussion and assignments: You will need to use Canvas
and Piazza
for discussion and for submitting assignments.