CS394R: Reinforcement Learning: Theory and Practice -- Fall 2016: Assignments Page

Assignments for Reinforcement Learning: Theory and Practice

Things to do ASAP (before the first class if possible)

  • Join the class discussion group (see class main page).

  • Week 0 (8/25): Class Overview

  • If you would like to get a jump on the class, read the following:
  • Chapter 1 of the course textbook (2nd edition)
  • Sign up to lead a discussion. We will try to have at most one person per day unless the class size is large, so fill the first slot for each day before doubling up.

  • Week 1 (8/30): Introduction and Evaluative Feedback

    Jump to the resources page.

  • Chapter 1 (until the end of Section 1.6) of the textbook
  • Introduction to Part I (just one page)
  • Chapter 2 (derivation in Section 2.7 is optional)
  • Do your first programming assignment (by Thursday)
  • For each week, be sure to submit a question or comment about each reading by 5pm on Monday as an email in plain ASCII text. Send it in the body of the email rather than as an attachment, use the subject line "class readings for [due date]", and send it to Peter and Sanmit (pstone@cs and sanmit@cs). Include your name in the response, and if you refer explicitly to the reading, include page numbers. Details on expectations for reading responses are on the main class page. Example successful responses from a previous class are available on the sample responses page.
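Chapter 2 centers on evaluative feedback in the k-armed bandit setting. As a rough sketch of the kind of agent the first programming assignment might involve (the function name and parameters below are illustrative, not taken from the assignment), here is a minimal epsilon-greedy learner with incremental sample-average value estimates:

```python
import random

def run_bandit(true_means, steps=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection on a Gaussian k-armed bandit,
    with incremental sample-average value estimates (Ch. 2 style)."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k   # estimated value of each arm
    n = [0] * k     # pull count for each arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                   # explore: random arm
        else:
            a = max(range(k), key=lambda i: q[i])  # exploit: greedy arm
        r = rng.gauss(true_means[a], 1.0)          # noisy reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]                  # incremental mean update
        total += r
    return q, total / steps

estimates, avg_reward = run_bandit([0.1, 0.5, 1.0])
```

With enough steps the greedy arm should be the one with the highest true mean, and the average reward should approach it (minus the cost of continued exploration).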

  • Week 2 (9/6): MDPs and Dynamic Programming

    Jump to the resources page.

  • Chapters 3 and 4 of the textbook (2nd edition)

  • Week 3 (9/13): Monte Carlo Methods and TD Learning

    Jump to the resources page.

  • Chapters 5 and 6 of the textbook

  • Week 4 (9/20): Multi-Step Bootstrapping and Planning

    Jump to the resources page.

  • Chapters 7 and 8 of the textbook

  • Week 5 (9/27): Approximate On-policy Prediction and Control

    Jump to the resources page.

  • Chapters 9 and 10 of the textbook

  • Week 6 (10/4): Approximate Off-policy Methods and Eligibility Traces

    Jump to the resources page.

  • Chapters 11 and 12 of the textbook.
    NOTE: These are still incomplete drafts, so please excuse any inconsistencies.
    Also, make sure you grab the "2015sep.pdf" version of the book from the class homepage.

  • Week 7 (10/11): Applications and Case Studies

    Jump to the resources page.

  • Chapter 16 of the textbook.
    NOTE: This is still an incomplete draft, so please excuse any inconsistencies.
    Also, make sure you grab the "2016sep.pdf" version of the book from the class homepage.
  • Class project proposal due at 11:59pm on Thursday. Please send an email (to the instructor and TA) with subject "Project Proposal" with a proposed topic for your class project. I anticipate projects taking one of two forms.
  • Practice (preferred): An implementation of RL in some domain of your choice - ideally one that you are using for research or in some other class. In this case, please describe the domain and your initial plans on how you intend to implement learning. What will the states and actions be? What algorithm(s) do you expect will be most effective?
  • Theory: A proposal, implementation, and testing of an algorithmic modification to an RL algorithm presented in the book. In this case, please describe the modification you propose to investigate and on what type of domain (possibly a toy domain) it is likely to show an improvement over the methods considered in the book.
  • See the project page for full details on the project.

  • Week 8 (10/18): Efficient Model-Based Exploration

    Jump to the resources page.

  • R-Max - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
    Ronen Brafman and Moshe Tennenholtz
    The Journal of Machine Learning Research (JMLR) 2002
  • An Analysis of Model-Based Interval Estimation for Markov Decision Processes
    Alexander L. Strehl and Michael L. Littman
    The Machine Learning Journal (MLJ) 2008.
  • Model-Based Exploration in Continuous State Spaces
    Nicholas K. Jong and Peter Stone
    The Seventh Symposium on Abstraction, Reformulation, and Approximation, July 2007.

  • Week 9 (10/25): Abstraction: Options and Hierarchy

    Jump to the resources page.

  • Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
    Sutton, R.S., Precup, D., Singh, S.
    Artificial Intelligence 112:181-211, 1999.
  • The MAXQ Method for Hierarchical Reinforcement Learning.
    Thomas G. Dietterich
    Proceedings of the 15th International Conference on Machine Learning, 1998.
  • Hierarchical Model-Based Reinforcement Learning: Rmax + MAXQ.
    Nicholas K. Jong and Peter Stone
    Proceedings of the 25th International Conference on Machine Learning, 2008.

  • Week 10 (11/1): Multiagent RL

    Jump to the resources page.

  • Markov Games as a Framework for Multi-Agent Reinforcement Learning
    Michael Littman
    ICML 1994.
  • Michael Bowling and Manuela Veloso
    Rational and Convergent Learning in Stochastic Games
    IJCAI 2001.
  • Doran Chakraborty and Peter Stone
    Convergence, Targeted Optimality and Safety in Multiagent Learning
    ICML 2010.
    journal version

  • Week 11 (11/8): Policy Gradient Methods

    Jump to the resources page.

  • Chapter 13 of the textbook.
    NOTE: This is still an incomplete draft, so please excuse any inconsistencies.
    Also, make sure you grab the "2016sep.pdf" version of the book from the class homepage.
  • Overview of Policy Gradient Methods by Jan Peters: http://www.scholarpedia.org/article/Policy_gradient_methods
  • Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion.
    Nate Kohl and Peter Stone
    In Proceedings of the IEEE International Conference on Robotics and Automation, May 2004.
  • Guided Policy Search
    Sergey Levine and Vladlen Koltun.
    ICML 2013.
    associated videos
  • Project literature review due at 11:59pm on Thursday. Please send an email (to the instructor and TA) with subject "Project literature review" containing the literature review for your class project.
    See the project page for full details on the project.
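The policy gradient readings above all build on the same log-likelihood-ratio update. As a hedged, self-contained illustration (names and hyperparameters here are my own choices, not from the readings), here is REINFORCE with a running-average baseline applied to a simple two-armed bandit with a softmax policy:

```python
import math
import random

def reinforce_bandit(true_means, episodes=5000, alpha=0.1, seed=0):
    """REINFORCE on a Gaussian bandit: softmax policy over arms,
    updated by theta[i] += alpha * (r - baseline) * d/dtheta log pi(a)."""
    rng = random.Random(seed)
    k = len(true_means)
    theta = [0.0] * k      # policy parameters (arm preferences)
    baseline = 0.0         # running-average reward baseline (variance reduction)
    for t in range(1, episodes + 1):
        exps = [math.exp(x) for x in theta]
        z = sum(exps)
        probs = [e / z for e in exps]              # softmax policy pi(a)
        a = rng.choices(range(k), weights=probs)[0]
        r = rng.gauss(true_means[a], 1.0)          # noisy reward
        baseline += (r - baseline) / t
        adv = r - baseline
        for i in range(k):
            # grad of log pi(a) w.r.t. theta[i] is 1{i == a} - probs[i]
            theta[i] += alpha * adv * ((1.0 if i == a else 0.0) - probs[i])
    exps = [math.exp(x) for x in theta]
    z = sum(exps)
    return [e / z for e in exps]                   # final policy

probs = reinforce_bandit([0.0, 1.0])
```

After training, the policy should place most of its probability on the higher-reward arm; the baseline is the standard variance-reduction trick discussed in the policy gradient literature.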

  • Week 12 (11/15): Inverse RL and Transfer Learning

    Jump to the resources page.

  • Apprenticeship Learning via Inverse Reinforcement Learning
    Pieter Abbeel and Andrew Ng
    ICML 2004.
  • An Introduction to Inter-task Transfer for Reinforcement Learning.
    Matthew E. Taylor and Peter Stone.
    AI Magazine, 32(1):15-34, 2011.

  • Week 13 (11/22): Deep RL

    Jump to the resources page.

  • Action-Conditional Video Prediction Using Deep Networks in ATARI Games.
    Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard Lewis, and Satinder Singh.
    Neural Information Processing Systems, 2015.
    Appendix
    Videos

  • Week 14 (11/29): Project Demos

    Jump to the resources page.


    Final Project: due at 9:30am on Thursday, 12/8

    [Back to Department Homepage]

    Page maintained by Peter Stone
    Questions? Send me mail