Online Learning and Optimization

(CS 395T)

This class has two major themes: algorithms for convex optimization and algorithms for online learning. The first half of the course focuses on algorithms for large-scale convex optimization, with particular emphasis on problems in machine learning, both in the lectures and in the problem sets. The second half turns to applications of these ideas to online learning.

  • What You Will Learn

    • Techniques for convex optimization such as gradient descent and its variants
    • Algorithms for online learning such as follow the leader and weighted majority
    • Multi-Armed Bandit problem and its variants
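To give a flavor of the online-learning half, the weighted majority algorithm mentioned above can be sketched in a few lines. This is an illustrative sketch, not course material; the penalty factor beta = 0.5 and the function name are assumptions for the example.

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Weighted majority: down-weight each expert that errs.

    expert_preds: list of rounds, each a list of 0/1 predictions (one per expert).
    outcomes: the true 0/1 label for each round.
    Returns the algorithm's predictions and the final expert weights.
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    algo_preds = []
    for preds, y in zip(expert_preds, outcomes):
        # Predict with the weighted majority vote of the experts.
        vote1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote0 = sum(w for w, p in zip(weights, preds) if p == 0)
        algo_preds.append(1 if vote1 >= vote0 else 0)
        # Multiplicatively penalize every expert that predicted incorrectly.
        weights = [w * (beta if p != y else 1.0) for w, p in zip(weights, preds)]
    return algo_preds, weights
```

The classic analysis covered in class shows the algorithm's mistake count is within a constant factor of the best expert's, plus an O(log n) term for n experts.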


  • Syllabus

    • Convex sets and convex functions, including basic definitions of convexity, smoothness, and strong convexity
    • First order optimality conditions for unconstrained and constrained convex optimization problems
    • Gradient and subgradient descent: Lipschitz functions, smooth functions, smooth and strongly convex functions
    • Oracle Lower Bounds
    • Accelerated Gradient Methods
    • Proximal and projected gradient descent. ISTA and FISTA
    • Mirror Descent
    • Frank-Wolfe
    • Stochastic Gradient Descent
    • Stochastic bandits with finite number of arms: Explore and commit algorithm, UCB algorithm and regret analysis
    • Adversarial bandits with finite number of arms: Exponential weighting and importance sampling, Exp3 algorithm and variants
    • Multi-armed Bandit (MAB) lower bounds: minimax bounds, problem-dependent bounds
    • Contextual bandits: Bandits with experts (the Exp4 algorithm), stochastic linear bandits, UCB algorithm with confidence balls (LinUCB and variants)
    • Contextual bandits in the adversarial setting: Online linear optimization (with full and bandit feedback), Follow The Leader (FTL) and Follow the Regularized Leader (FTRL), Mirror Descent
    • Online Classification: Halving algorithm, Weighted majority algorithm, Perceptron and Winnow algorithms (with connections to Online Gradient Descent and Online Mirror Descent)
    • Other Topics: Combinatorial bandits, Bandits for pure exploration, Bandits in a Bayesian setting, Thompson sampling
    • Newton and Quasi-Newton Methods
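As a taste of the optimization topics above, plain gradient descent on a smooth function fits in a few lines. This is a minimal sketch under assumed names; the quadratic example and step size are illustrative choices, not from the course.

```python
def gradient_descent(grad, x0, step, n_iters):
    """Plain gradient descent: x_{t+1} = x_t - step * grad(x_t).

    For an L-smooth convex function, a constant step size of 1/L
    yields an O(1/t) convergence rate in function value.
    """
    x = x0
    for _ in range(n_iters):
        x = x - step * grad(x)
    return x

# Illustrative example: f(x) = (x - 3)^2 is 2-smooth, so take step = 1/2.
x_hat = gradient_descent(lambda x: 2 * (x - 3), x0=0.0, step=0.5, n_iters=25)
# x_hat is (numerically) the minimizer x* = 3.
```

The course develops sharper guarantees than this O(1/t) rate, including acceleration and the linear rates available under strong convexity.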
  • Estimated Effort

    • 8-12 hours per week

    Course Category

    • Theory Course

    Course Availability

    • Spring 2022

Take the Next Step

Advance your computer science career with UT Austin's Master of Computer Science Online.