Overview, Syllabus, Structure

In this course, students will learn about the architecture, design, and implementation of the software and hardware systems that play a central role in modern machine learning training and inference, as well as in the large-scale data processing that supports training and inference. Lectures will discuss various state-of-the-art frameworks for programming distributed training, inference, and big data processing, as well as for their compilation, planning, scheduling, and distributed execution. On the theory front, after covering the basics of the modern hardware and software infrastructures that these machine learning systems leverage, we will explore the systems themselves from the ground up. Specifically, topics we cover will include:

On the practice front, we will do 3-4 programming assignments covering: See a tentative outline of planned assignments here.

Administrative Details

Class time

Monday, Wednesday 9:30 am - 11:00 am

Class location

UTC 4.110

Pre-requisites (not strict, but useful)

CS429 and CS439 are useful but not required. A working knowledge of machine learning and computer systems will suffice, as background relevant to the lectures, programming assignments, and quizzes will be provided in class. Programming proficiency in Python is strongly encouraged.

Grading

The course will have 4 in-class quizzes (no midterm or final exam) and 3 to 4 programming assignments that dovetail with in-class discussions of the topics above. The overall course grade will be evenly split between quizzes and programming assignments.

People

Instructor: Aditya Akella
Email: akella@cs.utexas.edu
Office: GDC 6.826 or Zoom
Office Hours: TBD

Instructor: Bodun Hu
Email: bodunhu@utexas.edu
Office Hours: TBD

Instructor: Brian Chang
Email: brianchang@utexas.edu
Office Hours: TBD