CS395-T: Robot Learning

Course Info

Semester: Fall 2017
Time: 11-12:15 Tues / Thurs
Location: GDC 3.516
Instructor: Scott Niekum
Prof. Office Hours: Wednesdays 1-2 PM and by appointment (GDC 3.404)
TA: Rolando Fernandez
TA Office Hours: Thursdays 2:30-3 PM (GDC 3.424D)

Course Description

Many classical problems in robotics have well-understood algorithmic solutions that do not (necessarily) require learning, including tracking, simultaneous localization and mapping, inverse kinematics, path planning, and optimal control. Such methods are often successfully combined to solve problems in controlled settings such as factories, but have failed to produce robust solutions to difficult tasks in unstructured dynamic environments, such as autonomous driving and manipulation — problems that require reasoning under uncertainty, generalization to new situations, and adaptation to change.

Fortunately, recent advances in machine learning have begun to address these challenging robotics problems by allowing robots to learn from their own actions, experiences, and interactions with humans, providing adaptability in uncertain, novel, and changing situations. This class will survey a wide range of modern techniques in robotics that learn from data, largely focusing on applications in manipulation. Topics will include imitation learning, reinforcement learning, inverse reinforcement learning, feature selection, skill acquisition, active learning, natural language processing, and human-robot interaction.

There will be no textbook. Links to all required readings will be provided in the class schedule.

Prerequisites: There are no formal prerequisites, but we will be covering material that relies heavily on machine learning, and there will not be time to cover all the requisite background material. For this reason, I strongly recommend having taken a graduate-level machine learning course, having equivalent research experience, or being willing to do significant studying outside of class. Students in past years without this background have sometimes struggled and reported getting significantly less out of the class.

Schedule


8/31 — Course overview

  • Assignment 0 — Submit paper preferences by 11:59pm on 9/3

9/5 — Keyframe learning

9/7 — Trajectory learning

9/12 — Supervised policy learning

9/14 — Reinforcement learning I

9/19 — Reinforcement learning II

9/21 — Inverse reinforcement learning I

9/26 — Inverse reinforcement learning II

9/28 — Inverse reinforcement learning III

10/3 — Optimality in IRL

10/5 — Policy search I

10/10 — Policy search II

10/12 — Safety

  • Primary: Trust Region Policy Optimization
    John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz.
    International Conference on Machine Learning, 2015.

  • Secondary: High Confidence Off-Policy Evaluation ***
    Philip S. Thomas, Georgios Theocharous, and Mohammad Ghavamzadeh.
    AAAI Conference on Artificial Intelligence, 2015.

10/17 — Transfer learning I

10/19 — Transfer learning II

10/24 — Skill learning I

10/26 — Skill learning II

10/31 — Object and affordance learning I

11/2 — Object and affordance learning II

11/7 — Grasping

11/9 — Prediction and planning

11/14 — Active learning

11/16 — Information gathering actions

11/21 — Interactive learning

11/23 — No class

11/28 — Dialog

11/30 — Human factors

12/5 — Final presentations

12/7 — Final presentations

Grading


Grades will be calculated as follows, using a scale that includes both plus and minus letter grades:

  • 15% Reading critiques:

    A written critique of the primary reading for each class will be due by 8:00 PM the previous night via Canvas. A critique is not required for the secondary reading. Each primary critique should include all of the following:

    • A short summary of the main contribution(s) of the paper in your own words (2 or 3 sentences)
    • A description of how the paper differs from prior work
    • One major strength and one significant weakness of the approach
    • A critique of the experiments — are they principled, sufficient, and convincing? If so, why? If not, what is missing?
    • One idea for future work or an extension to the presented method
    • At least one question or comment that you'd like me to address during class or that could spur discussion

    In all cases, the written critique should provide non-trivial insight into the reading. To get full credit, you must show that you understood and thought critically about the core concepts presented.

  • 15% Paper presentation:

    Each student will be responsible for preparing a presentation on one secondary reading during the semester (to be selected only from those labeled ***). Plan to present for 15 minutes per paper. Please practice your presentation so that it is within 3 minutes of this target time, as this will be part of your grade. Note that these presentations account for a significant portion of your final grade—students are expected to be well-prepared to present insightful material that will inspire discussion afterwards.

    To ensure that things go smoothly, each student will be required to submit their slides and a discussion plan for the class by 11:59 pm three days before the presentation. To be clear: if you present on a Tuesday, the materials are due the night of the previous Saturday; if you present on a Thursday, the materials are due the night of the previous Monday. The presentation should briefly touch on all of the following points:

    • An overview of the main contributions
    • A conceptual explanation of the technical approach
    • A limited amount of mathematical explanation, if appropriate
    • Comparisons to prior and/or related work
    • An analysis of experimental methods and results, if any
    • Opinions about the positive and negative aspects of the paper

    The presentations will be graded based on the following criteria:

    • Preparedness — the prepared slides should be high quality and informative, and the talk should be close to 15 minutes
    • Clarity — the talk should be easy to follow, terminology should be well-defined, and each idea should follow logically from the previous idea
    • Completeness — all important points in the above list should be sufficiently covered
    • Correctness — all information presented should be technically correct
    • Insight — the presenter should provide non-trivial insight into the paper, occasionally going beyond its contents in comparison and analysis

  • 25% Programming assignments:

    There will be two programming assignments designed to help students warm up for the final research project by implementing a few core learning algorithms. Each assignment will require students to turn in code as well as a short written report.

  • 35% Final project:

    Roughly halfway through the semester, students will propose topics of their own choosing for a large final programming project. These projects may be completed alone or in pairs, but more will be expected of students working in pairs. A rough guideline is that each student should produce about half a standard conference paper's worth of material (this means both technical content and length: about 4 double-column pages in LaTeX).

    These projects are a chance to dive deeply into any topic of interest related to the course. Students are encouraged to tie this work into research that they are already pursuing, and it can be carried out on any physical robot or simulated platform of their choice (it also doesn't have to be a robot; a video game might be a suitable target as well, for example).

    Example projects could include extending an algorithm in a novel way, comparing several algorithms on an interesting problem, or designing a new approach to attack a problem relevant to the class. In all cases, there should be a novel intellectual contribution, as well as empirical results on a problem of interest.

  • 10% Class participation:

    Students are expected to attend class and regularly contribute to discussions. As a rule of thumb, you should try to participate at least once per class during the presentations and/or discussions. I keep track of this and grade accordingly—these are not "free" points and low participation will result in a low grade.
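
As a rough illustration of how the weights above combine, the following Python sketch computes an overall course percentage from per-component scores. The component names, the 0-100 scale for each component, and the function name are assumptions made for illustration; the syllabus does not specify how component scores are normalized or how percentages map to letter grades.

```python
# Weights in percent, taken from the grading breakdown above.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "critiques": 15,
    "presentation": 15,
    "programming": 25,
    "project": 35,
    "participation": 10,
}


def course_percentage(scores: dict[str, float]) -> float:
    """Return the weighted course percentage.

    Assumes every component score is already on a 0-100 scale.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Weighted sum of component scores, divided by the total weight (100).
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {
    "critiques": 90,
    "presentation": 80,
    "programming": 85,
    "project": 92,
    "participation": 100,
}
print(course_percentage(example))
```

Because the final project carries 35% of the weight, a 10-point change in the project score moves the overall percentage by 3.5 points, while the same change in participation moves it by 1 point.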

Late work policy: Critiques will not be accepted late, since their main goal is to provide fuel for discussion. However, each student can skip up to two critiques without penalty. If you choose to skip in this way, please notify me and the TA via an email in the standard format listed above so that I know that your submission didn't just get lost in my inbox somewhere. No other extensions will be given for critiques, so please save these for times of necessity.

Presentations / discussion moderation cannot be late or skipped—this will result in a zero for the assignment. However, I am happy to schedule presentations to avoid personal conflicts like holy days, conference travel, etc. If an extreme circumstance arises that will interfere with presenting, please let me know as soon as possible. The final project and presentation also cannot be submitted late due to timing and scheduling restrictions at the end of the semester.
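
Since presentation materials cannot be late, it may help to compute the due date ahead of time. The sketch below applies the rule from the paper presentation section (slides and discussion plan due by 11:59 pm three days before the presentation). The function name is hypothetical, and the example dates are class days taken from the schedule.

```python
from datetime import date, datetime, time, timedelta


def materials_due(presentation_day: date) -> datetime:
    """Return the due time for slides and discussion plan:
    11:59 pm, three days before the presentation."""
    return datetime.combine(presentation_day - timedelta(days=3), time(23, 59))


# Tuesday 10/17 presentation: due the previous Saturday night.
print(materials_due(date(2017, 10, 17)))
# Thursday 10/19 presentation: due the previous Monday night.
print(materials_due(date(2017, 10, 19)))
```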

All other assignments and projects can be turned in up to one week late, at a loss of 10 points (out of 100) per late day.
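
The late penalty can be sketched as a small Python function. This is only an illustration under stated assumptions: lateness is counted in whole days, a score cannot drop below zero, and work more than one week late receives no credit. The policy above states the 10-point daily penalty and the one-week limit; the other details, and the function name, are assumptions.

```python
def late_adjusted_score(raw_score: float, days_late: int) -> float:
    """Apply a 10-point (out of 100) penalty per late day.

    Assumptions: days_late is a whole number of days, scores are
    floored at zero, and work more than 7 days late earns no credit.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 7:
        return 0.0  # assumption: not accepted after one week
    return max(0.0, raw_score - 10 * days_late)


print(late_adjusted_score(90, 3))  # 90 - 3 * 10
```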

Academic honesty policy

You are encouraged to discuss assignments with classmates, but all collected data, analysis, images and graphs, and other written work must be your own. All programming assignments must be entirely your own except for teamwork on the final project. You may NOT look online for existing implementations of algorithms related to the programming assignments, even as a reference. Your code will be analyzed by automatic tools that detect plagiarism to ensure that it is original. For the final project you have full access to the web, but all ideas, quotes, and code fragments that originate from elsewhere must be cited according to standard academic practice. Students caught cheating will automatically fail the course and will be reported to the university. If in doubt about the ethics of any particular action, look at the departmental guidelines and/or ask; ignorance of the rules will not shield you from potential consequences.

Notice about students with disabilities

The University of Texas at Austin provides upon request appropriate academic accommodations for qualified students with disabilities. For more information, contact the Division of Diversity and Community Engagement — Services for Students with Disabilities at 512-471-6529; 512-471-4641 TTY.

Notice about missed work due to religious holy days

A student who misses an examination, work assignment, or other project due to the observance of a religious holy day will be given an opportunity to complete the work missed within a reasonable time after the absence, provided that he or she has properly notified the instructor. It is the policy of the University of Texas at Austin that the student must notify the instructor at least fourteen days prior to the classes scheduled on dates he or she will be absent to observe a religious holy day. For religious holy days that fall within the first two weeks of the semester, the notice should be given on the first day of the semester. The student will not be penalized for these excused absences, but the instructor may appropriately respond if the student fails to complete satisfactorily the missed assignment or examination within a reasonable time after the excused absence.