CS395T - Fall 2025

Foundations of Machine Learning for Systems Researchers

Time: Tue/Thu 12:30–2:00 PM
Location: GDC 4.304

Course Overview

Rapid advances in machine learning have produced systems with capabilities that were unimaginable just a few years ago: AlphaGo, which plays the board game Go better than any human player; DeepSeekCoder, which can generate programs in more than 300 programming languages; and AlphaEvolve, which can discover advanced algorithms. This course analyzes the key breakthroughs that underlie these systems and examines how they can be applied to build systems that solve other problems.

Lectures will cover three main machine learning technologies used in these systems: deep neural networks, reinforcement learning, and evolutionary computing. Unlike in typical machine learning courses, this material will be presented using PL/systems concepts such as dataflow analysis. Each week, lectures will be complemented by student presentations of key papers in these areas from recent AI/ML conferences. Some of these papers go into greater depth on the core AI/ML technologies, while others are case studies that analyze how those technologies are deployed in systems for board games, multiplayer games, coding assistants, and algorithm discovery.

Prerequisites

  • Proficiency in Python
  • Familiarity with undergraduate-level algorithms and data structures
  • Familiarity with calculus, statistics, and linear algebra, and strong mathematical skills
  • Coursework or equivalent experience in AI and ML at the level of CS342 is strongly recommended. Practical experience with training machine learning models will be useful.

Coursework

  • Class presentations 25%
  • Class participation 10%
  • 3-4 programming assignments 25%
  • A substantial term project 40%

Academic Honesty

You may discuss concepts with classmates, but all written work and programming assignments must be your own work, or your project team's work when teamwork is permitted. You may not search online for existing implementations of algorithms related to the programming assignments, even as a reference. Students caught cheating will automatically fail the course and will be reported to the university. If you are in doubt about the ethics of any particular action, talk to the instructor or the TA.

Course Staff

Dr. Keshav Pingali

pingali@cs.utexas.edu
Office: POB 4.126
OH: Tue 2–3 PM

Lain Mustafaoglu

zsm@utexas.edu
OH: Wed 2–3 PM

Course Outcomes

By the end of this course, students will be able to:

Lecture Schedule

Below is the tentative schedule for the course. Note that dates and topics may change as the semester progresses.

| Lecture | Date | Topic | Materials/Readings | Assignments & Deadlines |
|---|---|---|---|---|
| 1 | 8/26 (T) | Introduction | AlphaEvolve: A coding agent for scientific and algorithmic discovery; DeepSeekCoder/DeepSeek-R1; AlphaGo | |
| 2 | 8/28 (Th) | Abstract NN & gradient computation | | Introduction post/survey due |
| 3 | 9/2 | DNNs, CNNs, RNNs, practical issues | CNNs; RNNs | |
| 4 | 9/4 | Sparsity in DNNs | | |
| 5 | 9/9 | Attention, Transformers, LLMs | Attention Is All You Need (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017); (Additional readings TBA) | |
| 6 | 9/11 | Presentations: Training LLMs | | |
| 7 | 9/16 | Monte Carlo methods & variance reduction | Barto & Sutton – Ch 5 Monte Carlo Methods | Assignment 1 due: Train a model with a PEFT technique |
| 8 | 9/18 | MDPs | Barto & Sutton – Ch 3 MDPs | |
| 9 | 9/23 | Sampling (TD(0), TD(n), MC, Q-learning) | Barto & Sutton – Ch 6 Temporal Difference Learning | |
| 10 | 9/25 | Presentations: Planning (MCTS/AlphaGo) | | |
| 11 | 9/30 | Policy gradients (I): REINFORCE | | |
| 12 | 10/2 | Presentations: Double DQN, DQN, HER | | |
| 13 | 10/7 | Policy gradients (II): Baseline methods | TBD | Assignment 2 due: Grid-world (≥ 2 methods) or Pong (MLP + CNN) |
| 14 | 10/9 | Presentations: A2C/A3C & DDPG | | |
| 15 | 10/14 | Policy gradients (III): Trust-region methods | | Project ideas due (meeting required) |
| 16 | 10/16 | Presentations: Applications of Advanced Policy Gradient methods (DeepSeek/GRPO) | | |
| 17 | 10/21 | Reinforcement Learning from Human Feedback (RLHF) | | |
| 18 | 10/23 | Presentations: RLHF | | |
| 19 | 10/28 | Evolutionary Computation | | Assignment 3 due: Implement an actor-critic method of choice |
| 20 | 10/30 | Presentations: Applications of Evolutionary Computation (AlphaEvolve, etc.) | | |
| 21 | 11/4 | Imitation Learning | Tutorial on Imitation Learning | Project check-in #1 (meeting required) |
| 22 | 11/6 | Presentations: Imitation Learning | | |
| 23 | 11/11 | Parallel/Distributed RL | Tutorial | Assignment 4 due: Evolve a neural network or implement a genetic algorithm |
| 24 | 11/13 | Presentations: Large-scale distributed RL | | |
| 25 | 11/18 | Project check-ins | | |
| 26 | 11/20 | Project check-ins | | |
| | | THANKSGIVING BREAK | | |
| 27 | 12/2 | Project presentations | | |
| 28 | 12/4 | Project presentations | | Final project paper due |

Assignments

Resources