CS395-T: Robot Learning
Course Info
Semester: Fall 2017
Time: 11-12:15 Tues / Thurs
Location: GDC 3.516
Instructor: Scott Niekum
Prof. Office Hours: Wednesdays 1-2 PM and by appointment (GDC 3.404)
TA: Rolando Fernandez
TA Office Hours: Thursdays 2:30-3 PM (GDC 3.424D)
Course Description
Many classical problems in robotics have well-understood algorithmic solutions that do not (necessarily) require learning, including
tracking, simultaneous localization and mapping, inverse kinematics, path planning, and optimal control.
Such methods are often successfully combined to solve problems in controlled settings such as factories, but have failed
to produce robust solutions to difficult tasks in unstructured dynamic environments,
such as autonomous driving and manipulation — problems that
require reasoning under uncertainty, generalization to new situations, and adaptation to change.
Fortunately, recent advances in machine learning have begun to address these challenging robotics problems by allowing robots to learn
from their own actions, experiences, and interactions with humans, providing adaptability in uncertain, novel, and changing situations.
This class will survey a wide range of modern techniques in robotics that learn from data, largely focusing on applications in manipulation.
Topics will include imitation learning, reinforcement learning, inverse reinforcement learning,
feature selection, skill acquisition, active learning, natural language processing, and human-robot interaction.
There will be no textbook. Links to all required readings will be provided in the class schedule.
Prerequisites: There are no formal prerequisites, but we will cover material that relies heavily on machine learning, and
there will not be time to cover all of the requisite background. For this reason, I strongly recommend having taken
a graduate-level machine learning course, having equivalent research experience, or being willing to do significant studying outside of class.
Students in past years without this background have sometimes struggled and reported getting significantly less out of the class.
Schedule
8/31 — Course overview
- Assignment 0: Submit paper preferences by 11:59pm on 9/3
9/5 — Keyframe learning
9/7 — Trajectory learning
9/12 — Supervised policy learning
- Primary: A reduction of imitation learning and structured prediction to no-regret online learning
Stéphane Ross, Geoffrey J. Gordon, and J. Andrew Bagnell.
arXiv preprint arXiv:1011.0686, 2011.
- Secondary: On learning, representing, and generalizing a task in a humanoid robot
Sylvain Calinon, Florent Guenter, and Aude Billard.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(2), 2007.
9/14 — Reinforcement learning I
9/19 — Reinforcement learning II
9/21 — Inverse reinforcement learning I
9/26 — Inverse reinforcement learning II
9/28 — Inverse reinforcement learning III
10/3 — Optimality in IRL
10/5 — Policy search I
10/10 — Policy search II
10/12 — Safety
10/17 — Transfer learning I
10/19 — Transfer learning II
10/24 — Skill learning I
- Primary: Incremental semantically grounded learning from demonstration
Scott Niekum, Sachin Chitta, Andrew G. Barto, Bhaskara Marthi, and Sarah Osentoski.
Robotics: Science and Systems, 2013.
- Secondary: Towards learning hierarchical skills for multi-phase manipulation tasks ***
Oliver Kroemer, Christian Daniel, Gerhard Neumann, Herke Van Hoof, and Jan Peters.
IEEE International Conference on Robotics and Automation, 2015.
- Final Project, Part 2: Due by 11:59pm on 11/13
10/26 — Skill learning II
10/31 — Object and affordance learning I
11/2 — Object and affordance learning II
11/7 — Grasping
11/9 — Prediction and planning
11/14 — Active learning
11/16 — Information gathering actions
11/21 — Interactive learning
11/23 — No class
11/28 — Dialog
11/30 — Human factors
12/5 — Final presentations
12/7 — Final presentations
Grading
Grades will be calculated as follows, using a scale that includes both plus and minus letter grades:
- 15% Reading critiques:
A written critique of the primary reading for each class will be due by 8:00 PM the previous night via Canvas.
A critique is not required for the secondary reading.
Each primary critique should include all of the following:
- A short summary of the main contribution(s) of the paper in your own words (2 or 3 sentences)
- A description of how the paper differs from prior work
- One major strength and one significant weakness of the approach
- A critique of the experiments — are they principled, sufficient, and convincing? If so, why? If not, what is missing?
- One idea for future work or an extension to the presented method
- At least one question or comment that you'd like me to address during class or that could spur discussion
In all cases, the written critique should provide non-trivial insight into the reading.
To get full credit, you must show that you understood and thought critically about the core concepts presented.
- 15% Paper presentation:
Each student will be responsible for preparing a presentation on one secondary reading during the semester (to be selected only from those labeled ***).
Plan to present for 15 minutes per paper. Please practice your presentation so that it is within
3 minutes of this target time, as this will be part of your grade.
Note that these presentations account for a significant portion of your final
grade—students are expected to be well-prepared to present insightful material that will inspire discussion afterwards.
To ensure that things go smoothly, each student will be required to
submit their slides and a discussion plan for the class by 11:59 pm three days before the presentation.
To be clear: if you present on a Tuesday, the materials are due the night of the previous Saturday;
if you present on a Thursday, the materials are due the night of the previous Monday.
The presentation should briefly touch on all of the following points:
- An overview of the main contributions
- A conceptual explanation of the technical approach
- A limited amount of mathematical explanation, if appropriate
- Comparisons to prior and/or related work
- An analysis of experimental methods and results, if any
- Opinions about the positive and negative aspects of the paper
The presentations will be graded based on the following criteria:
- Preparedness: the prepared slides should be high quality and informative, and the talk should be close to 15 minutes
- Clarity: the talk should be easy to follow, terminology should be well-defined, and each idea should follow logically from the previous idea
- Completeness: all important points in the above list should be sufficiently covered
- Correctness: all information presented should be technically correct
- Insight: the presenter should provide non-trivial insight into the paper, occasionally going beyond its contents in comparison and analysis
- 25% Programming assignments:
There will be two programming assignments designed to help students warm up for the final research project by
implementing a few core learning algorithms. Each assignment will require students to turn in
code as well as a short written report.
- 35% Final project:
Roughly halfway through the semester, students will propose topics of their own choosing for a large final programming project.
These projects may be completed alone or in pairs, but more will be expected of students working in pairs.
A rough guideline is that each student should produce about half a standard conference paper worth of material
(this means both technical content and length: about 4 double-column pages in LaTeX).
These projects are a chance to dive deeply into any topic of interest related to the course. Students are encouraged
to tie this work into research that they are already pursuing. Projects can be carried out on any physical robot or
simulated platform of their choice (the platform doesn't have to be a robot; a video game might be a suitable target as well, for example).
Example projects could include extending an algorithm in a novel way, comparing several algorithms on an interesting problem, or
designing a new approach to attack a problem relevant to the class. In all cases, there should be a novel intellectual
contribution, as well as empirical results on a problem of interest.
- 10% Class participation:
Students are expected to attend class and regularly contribute to discussions. As a rule of thumb, you should
try to participate at least once per class during the presentations and/or discussions. I keep track of this and
grade accordingly—these are not "free" points and low participation will result in a low grade.
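As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course score from per-component scores on a 0-100 scale. The function name, the dictionary keys, and the example scores are hypothetical. The mapping from the resulting number to a plus/minus letter grade is not specified here, so it is left out.

```python
# Component weights taken from the grading breakdown above (they sum to 100%).
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "critiques": 0.15,
    "presentation": 0.15,
    "programming": 0.25,
    "final_project": 0.35,
    "participation": 0.10,
}


def weighted_score(scores):
    """Combine per-component scores (each 0-100) into one 0-100 score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, for illustration only.
example = {
    "critiques": 90,
    "presentation": 80,
    "programming": 85,
    "final_project": 88,
    "participation": 100,
}
print(round(weighted_score(example), 2))
```

Because the final project carries 35% of the total, a 10-point change in that component moves the overall score by 3.5 points, more than the same change in any other component.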
Late work policy: Critiques will not be accepted late, since their main goal is to provide fuel for discussion.
However, each student can skip up to two critiques without penalty. If you choose to skip in this way, please notify me and the TA
via an email in the standard format listed above so that I know that your submission didn't just get lost in my inbox somewhere.
No other extensions will be given for critiques, so please save these for times of necessity.
Presentations / discussion moderation cannot be late or skipped—this will result in a zero for the assignment.
However, I am happy to schedule presentations to avoid personal conflicts like holy days, conference travel, etc.
If an extreme circumstance arises that will interfere with presenting, please let me know as soon as possible.
The final project and presentation also cannot be submitted late due to timing and scheduling restrictions at the end
of the semester.
All other assignments and projects can be turned in up to one week late, at a loss of 10 points (out of 100) per late day.
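The late penalty for these assignments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, late days are assumed to be counted as whole days, the adjusted score is assumed to be floored at zero, and work more than seven days late is assumed not to be accepted. How partial days are counted is not specified in the policy.

```python
def late_adjusted_score(raw_score, days_late):
    """Apply a 10-points-per-late-day penalty to a score out of 100.

    Returns None when the work is more than one week late, which this
    sketch treats as "not accepted" (an assumption).
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 7:
        return None
    # Flooring at zero is an assumption; the policy only states the penalty rate.
    return max(0, raw_score - 10 * days_late)


# Made-up example: a 90-point submission turned in three days late.
print(late_adjusted_score(90, 3))
```

Under these assumptions, a submission that is a full week late loses 70 points, so even a perfect assignment would receive at most 30.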
Academic honesty policy
You are encouraged to discuss assignments with classmates, but all collected data, analysis, images and graphs, and
other written work must be your own. All programming assignments must
be entirely your own except for teamwork on the final project.
You may NOT look online for existing implementations of
algorithms related to the programming assignments, even as a reference.
Your code will be analyzed by automatic tools that detect plagiarism to ensure that it is original.
For the final project you have full access to the web, but all ideas, quotes, and code fragments that originate from elsewhere must
be cited according to standard academic practice.
Students caught cheating will automatically fail the course and will be reported to the university.
If in doubt about the ethics of any particular action, look at the departmental guidelines and/or ask;
ignorance of the rules will not shield you from potential consequences.
Notice about students with disabilities
The University of Texas at Austin provides upon request appropriate
academic accommodations for qualified students with disabilities.
For more information, contact the Division of Diversity and Community Engagement — Services for Students with Disabilities at
512-471-6529; 512-471-4641 TTY.
Notice about missed work due to religious holy days
A student who misses an examination, work assignment, or other project
due to the observance of a religious holy day will be given an
opportunity to complete the work missed within a reasonable time after
the absence, provided that he or she has properly notified the
instructor. It is the policy of the University of Texas at Austin
that the student must notify the instructor at least fourteen days
prior to the classes scheduled on dates he or she will be absent to
observe a religious holy day. For religious holy days that fall
within the first two weeks of the semester, the notice should be given
on the first day of the semester. The student will not be penalized
for these excused absences, but the instructor may appropriately
respond if the student fails to complete satisfactorily the missed
assignment or examination within a reasonable time after the excused
absence.