Office hours: Tuesday 1-2pm at POB 4.126

TA: Roshan Dathathri

TA office hours: Friday 11am-noon at POB 4.116

Canvas: for assignment submissions and grades.

Piazza: for announcements and discussions.

Course Description

To obtain the high level of end-to-end performance needed in problem domains like graphics, computer games, and scientific computing, it is necessary for programs to exploit many of the features of modern computer architectures.  In this course, we will study the performance-critical features of modern computer architectures, and discuss how applications can take advantage of them to obtain high performance.  This is not a course on software tricks; rather, the emphasis is on abstractions of computer architecture, understanding performance, and obtaining performance when you need it.

Topics include the following:

  1. Analysis of applications that need high end-to-end performance
  2. Understanding performance: performance models, Amdahl's law
  3. Measurement and design of computer experiments
  4. Microbenchmarks for abstracting performance-critical aspects of computer systems
  5. Memory hierarchy: caches, virtual memory, exploiting spatial and temporal locality
  6. Vectors and vectorization
  7. GPUs and GPU programming
  8. Multi-core processors and shared-memory programming, OpenMP
  9. Distributed-memory machines and message-passing programming, MPI
  10. Optimistic parallelization
  11. Self-optimizing software

There will be 4 or 5 substantial programming assignments and a final project.

Prerequisites: programming maturity, knowledge of C/C++, and a basic course on modern computer architecture.

For basic material on computer architecture, read "Computer Architecture: A Quantitative Approach" by Hennessy & Patterson, Morgan Kaufmann Publishers.


Lecture slides and notes


Extra resources:
Reference material on Computer Architecture

Moore's Law