UTCS Reinforcement Learning Reading Group
The UTCS Reinforcement Learning Reading Group is a student-run group that discusses research papers related to reinforcement learning. Since its first meeting in the spring of 2004, the group has served as a forum for students to discuss interesting research ideas in an informal setting. Meetings are usually held in the afternoon, and refreshments are provided. Occasionally, the group hosts invited talks. Matt Taylor coordinated the group from Spring 2004 through Fall 2005. It is currently managed by Shivaram Kalyanakrishnan.
This page provides information about group meetings. It also lists useful resources for reinforcement learning and serves as a repository of all past readings.
New members are always welcome! Interested students and researchers may also subscribe to the group's e-mailing list.
Meeting Time and Place
The group will meet at noon every other Friday in ENS 32NEA. Meeting time and place may change on occasion.
Next Meeting
Yet to be scheduled.
Communication
The reading group has an e-mailing list (rlreadinggroup@utlists.utexas.edu) on which regular announcements are made.
- To subscribe to the list or to unsubscribe from it, send your request through e-mail to shivaram@cs.utexas.edu.
- To send e-mail to the list, address it to rlreadinggroup@utlists.utexas.edu.
Reinforcement Learning Resources
Suggested Readings for Future Meetings
-
Maximum Entropy Inverse Reinforcement Learning
Brian Ziebart, Andrew Maas, J. Andrew Bagnell and Anind Dey, 2008
Suggestion from Peter Stone and Kurt Dresner.
Paper Readings and Talks (Reverse Chronological Order)
Summer 2008
-
Planning and Learning in Environments with Delayed Feedback
Thomas Walsh, Ali Nouri, Lihong Li and Michael Littman, 2007
Discussion led by Shivaram Kalyanakrishnan, June 20, 2008.
-
Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning
Arthur Guez, Robert Vincent, Massimo Avoli and Joelle Pineau, 2008
Discussion led by Matt Taylor, May 30, 2008.
Spring 2008
-
Perspectives on Reinforcement Learning: Group Discussion
Discussion led by Michael Quinlan, April 25, 2008.
-
Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis
Claudia Goldman and Shlomo Zilberstein, 2004
Discussion led by Doran Chakraborty, April 11, 2008.
-
Planning with Durative Actions in Stochastic Domains
Mausam and Daniel Weld, 2007
Discussion led by Doran Chakraborty, April 4, 2008.
-
Fourteen Declarative Principles for an Integrative Science of the Temporal Dynamics of Learning
Rich Sutton, 2008
Discussion led by Todd Hester, March 21, 2008.
-
Factor-Guided Motion Planning for a Robot Arm
Jaesik Choi and Eyal Amir, 2007
Discussion led by Michael Quinlan, February 29, 2008.
Invited Speaker: Eyal Amir.
-
Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man
István Szita and András Lőrincz, 2007
Discussion led by Matt Taylor, February 15, 2008.
-
A Natural Policy Gradient
Sham Kakade, 2002
Discussion led by Andrew Dreher, January 25, 2008.
Fall 2007
-
ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning, Part 4
Pascal Poupart, Mohammad Ghavamzadeh and Yaakov Engel, 2007
Reinforcement learning with Gaussian processes
Yaakov Engel, Shie Mannor and Ron Meir, 2005
Discussion led by Joe Reisinger, December 7, 2007.
-
ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning, Parts 1, 2, 3
Pascal Poupart, Mohammad Ghavamzadeh and Yaakov Engel, 2007
Discussion led by Nick Jong, November 30, 2007.
-
Constructing Basis Functions from Directed Graphs for Value Function Approximation
Jeff Johns and Sridhar Mahadevan, 2007
Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes
Sridhar Mahadevan and Mauro Maggioni, 2007
Discussion led by Matt Taylor, November 16, 2007.
-
The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems
Caroline Claus and Craig Boutilier, 1998
Discussion led by Doran Chakraborty, November 9, 2007.
-
A new mathematical framework for optimal choice of actions
Emanuel Todorov, 2007
Discussion led by Ian Fasel and Shivaram Kalyanakrishnan, October 12, 2007.
-
Combining Online and Offline Knowledge in UCT
Sylvain Gelly and David Silver, 2007
Discussion led by Joe Reisinger, September 21, 2007.
-
Dynamic Positioning in 3D RoboCup Soccer
Presentation by Sahar Asadi, September 7, 2007.
Summer 2007
-
Dirichlet Process Mixtures
Khalid El-Arini, 2005
A Bayesian Framework for Reinforcement Learning
Malcolm Strens, 2000
Discussion led by Nick Jong, June 29, 2007.
-
Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach
Aaron Wilson, Alan Fern, Soumya Ray and Prasad Tadepalli, 2007
Discussion led by Todd Hester, June 15, 2007.
-
Nash Q-Learning for General-Sum Stochastic Games
Junling Hu and Michael Wellman, 2003
Discussion led by Shimon Whiteson, June 8, 2007.
-
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
Colin McMillen and Manuela Veloso, 2007
Discussion led by Shivaram Kalyanakrishnan, June 1, 2007.
-
Efficient Reinforcement Learning with Relocatable Action Models
Bethany Leffler, Michael Littman and Timothy Edmunds, 2007
Discussion led by Matt Taylor, May 25, 2007.
Spring 2007
-
Adaptive Representations for Reinforcement Learning
Shimon Whiteson, 2007
Ph.D. defense by Shimon Whiteson, April 20, 2007.
-
Integrating Guidance into Relational Reinforcement Learning
Kurt Driessens and Sašo Džeroski, 2004
Discussion led by Andrew Dreher, April 6, 2007.
-
Online Learning and Exploiting Relational Models in Reinforcement Learning
Tom Croonenborghs, Jan Ramon, Hendrik Blockeel and Maurice Bruynooghe, 2007
Discussion led by Matt Taylor, March 23, 2007.
-
State Similarity Based Approach for Improving Performance in RL
Sertan Girgin, Faruk Polat and Reda Alhajj, 2007
Discussion led by Todd Hester, March 9, 2007.
-
Deictic Option Schemas
Balaraman Ravindran, Andrew Barto and Vimal Mathew, 2007
Discussion led by Rahul Iyer, March 2, 2007.
-
Bayesian Q-learning
Richard Dearden, Nir Friedman and Stuart Russell, 1998
Discussion led by David Pardoe, February 9, 2007.
-
An Intrinsic Reward Mechanism for Efficient Exploration
Özgür Şimşek and Andrew Barto, 2006
Discussion led by Shivaram Kalyanakrishnan, January 26, 2007.
Fall 2006
-
Decision Tree Methods for Finding Reusable MDP Homomorphisms
Alicia Wolfe and Andrew Barto, 2006
Discussion led by Rahul Iyer, December 4, 2006.
-
Robot planning in partially observable continuous domains
Josep Porta, Matthijs Spaan and Nikos Vlassis, 2005
Discussion led by Igor Karpov, November 20, 2006.
-
Reinforcement Learning in POMDP's via Direct Gradient Ascent
Jonathan Baxter and Peter Bartlett, 2000
Discussion led by Yaxin Liu, November 6, 2006.
-
Reinforcement Learning for Optimized Trade Execution
Yuriy Nevmyvaka, Yi Feng and Michael Kearns, 2006
Discussion led by Andrew Dreher, October 23, 2006.
-
Looping Suffix Tree-Based Inference of Partially Observable Hidden State
Michael Holmes and Charles Isbell, Jr., 2006
Discussion led by Nick Jong, October 9, 2006.
-
Sparse Cooperative Q-learning
Jelle Kok and Nikos Vlassis, 2004
Discussion led by Shivaram Kalyanakrishnan, September 25, 2006.
-
Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains
Vishal Soni and Satinder Singh, 2006
Discussion led by Matt Taylor, September 11, 2006.
Summer 2006
Spring 2006
-
State Space Reduction for Autonomous Reinforcement Learning
Mehran Asadi and Manfred Huber, 2004
Accelerating Action Dependent Hierarchical Reinforcement Learning Through Autonomous Subgoal Discovery
Mehran Asadi and Manfred Huber, 2005
Discussion led by Nick Jong, April 28, 2006.
-
Autonomous Helicopter Flight via Reinforcement Learning
Andrew Ng, H. Jin Kim, Michael Jordan and Shankar Sastry, 2004
Discussion led by Shivaram Kalyanakrishnan, April 14, 2006.
-
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard Sutton, David McAllester, Satinder Singh and Yishay Mansour, 2000
Discussion led by Shimon Whiteson, March 31, 2006.
-
Building Portable Options: Skill Transfer in Reinforcement Learning
George Konidaris and Andrew Barto, 2006
Discussion led by Matt Taylor, March 10, 2006.
-
CBR for State Value Function Approximation in Reinforcement Learning
Thomas Gabel and Martin Riedmiller, 2005
Discussion led by Nick Jong, February 24, 2006.
-
Why (PO)MDPs Lose for Spatial Tasks and What to Do About It
Terran Lane and William Smart, 2005
Discussion led by Shivaram Kalyanakrishnan, February 10, 2006.
-
Temporal-Difference Networks
Richard Sutton and Brian Tanner, 2005
Discussion led by Bikram Banerjee, January 27, 2006.
Fall 2005
-
Developing navigation behavior through self-organizing distinctive state abstraction
Jefferson Provost, Benjamin Kuipers and Risto Miikkulainen, 2006
Discussion led by Jeff Provost, December 14, 2005.
-
An Algorithmic Description of XCS
Martin Butz and Stewart Wilson, 2000
Discussion led by David Pardoe, December 2, 2005.
-
Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005
A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm
Martin Riedmiller and Heinrich Braun, 1993
Discussion led by Shivaram Kalyanakrishnan, November 18, 2005.
-
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State Spaces
Andrew Moore and Christopher Atkeson, 1995
Discussion led by Nick Jong, November 4, 2005.
-
Samuel Meets Amarel: Automating Value Function Approximation using Global State Space Analysis
Sridhar Mahadevan, 2005
Discussion led by Jeff Provost, October 14, 2005.
-
Near-Optimal Reinforcement Learning in Polynomial Time
Michael Kearns and Satinder Singh, 1998
Discussion led by Yaxin Liu, September 23, 2005.
Summer 2005
-
An Empirical Evaluation of Interval Estimation for Markov Decision Processes
Alexander Strehl and Michael Littman, 2004
Discussion led by Shimon Whiteson, August 26, 2005.
-
Using Advice to Transfer Knowledge Acquired in One Reinforcement Learning Task to Another
Lisa Torrey, Trevor Walker, Jude Shavlik and Richard Maclin, 2005
Discussion led by Matt Taylor, August 12, 2005.
-
Guiding Inference through Relational Reinforcement Learning
Nima Asgharbeygi, Negin Nejati, Pat Langley and Sachiyo Arai, 2005
Discussion led by Matt Taylor, June 17, 2005.
-
Least-Squares Policy Iteration
Michail Lagoudakis and Ronald Parr, 2003
Discussion led by Lily Mihalkova, May 27, 2005.
-
Relational Reinforcement Learning
Sašo Džeroski, Luc De Raedt and Kurt Driessens, 2001
Discussion led by Greg Kuhlmann, May 6, 2005.
Spring 2005
-
Intrinsically Motivated Reinforcement Learning
Satinder Singh, Andrew Barto and Nuttapong Chentanez, 2005
Discussion led by Nick Jong, April 22, 2005.
-
Gradient Descent for General Reinforcement Learning
Leemon Baird and Andrew Moore, 1999
Discussion led by Shimon Whiteson, April 8, 2005.
-
Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks
Chris Drummond, 2002
Discussion led by Matt Taylor, March 25, 2005.
-
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
Michael Kearns, Yishay Mansour and Andrew Ng, 2001
Discussion led by David Pardoe, March 4, 2005.
-
No Free Lunch Theorems for Optimization
David Wolpert and William Macready, 1996
Discussion led by Shimon Whiteson, February 18, 2005.
-
Common myths and misstatements about reinforcement learning
Various Authors, 1999 onwards
Discussion led by Matt Taylor, February 4, 2005.
Fall 2004
-
Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail Lagoudakis and Ronald Parr, 2003
Discussion led by Nick Jong, November 29, 2004.
-
Discovering Hierarchy in Reinforcement Learning with HEXQ
Bernhard Hengst, 2002
Discussion led by Mazda Ahmadi, November 15, 2004.
-
Synthesizing Policy Search and Temporal Difference Methods for Reinforcement Learning
Discussion led by Shimon Whiteson, November 1, 2004.
-
Implicit Negotiation in Repeated Games
Michael Littman and Peter Stone, 2001
Discussion led by Matt Taylor, October 25, 2004.
-
Markov Games as a Framework for Multi-Agent Reinforcement Learning
Michael Littman, 1994
Discussion led by David Pardoe, October 11, 2004.
-
RL Methodology
Discussion led by Lily Mihalkova, September 20, 2004.
Summer 2004
-
Three Automated Stock-Trading Agents: A Comparative Study
Alexander Sherstov and Peter Stone, 2004
Practice talk by Sasha Sherstov, July 9, 2004.
Bidding for Customer Orders in TAC SCM
David Pardoe and Peter Stone, 2004
Practice talk by David Pardoe, July 9, 2004.
-
The MAXQ Method for Hierarchical Reinforcement Learning
Thomas Dietterich, 1998
Reinforcement Learning: A Survey, Section 6
Leslie Kaelbling, Michael Littman and Andrew Moore, 1996
Discussion led by Jeff Provost.
-
Acting Optimally in Partially Observable Stochastic Domains
Anthony Cassandra, Leslie Kaelbling and Michael Littman, 1994
Discussion led by Shimon Whiteson, June 11, 2004.
-
Reinforcement Learning: A Survey, Sections 4 and 5
Leslie Kaelbling, Michael Littman and Andrew Moore, 1996
Reinforcement Learning: An Introduction, Chapter 9
Richard Sutton and Andrew Barto, 1998
Discussion on model-free and model-based learning led by Peggy Fidelman, May 29, 2004.
-
Residual Algorithms: Reinforcement Learning with Function Approximation
Leemon Baird, 1995
Discussion led by Lily Mihalkova, May 14, 2004.
Spring 2004
-
A Quantitative Study of Hypothesis Selection
Philip Fong, 1995
Discussion led by Nick Jong, April 23, 2004.
-
Machine Learning for Fast Quadrupedal Locomotion
Nate Kohl and Peter Stone, 2004
Practice talk by Nate Kohl, April 16, 2004.
-
Policy invariance under reward transformations: Theory and application to reward shaping
Andrew Ng, Daishi Harada and Stuart Russell, 1999
Discussion led by Greg Kuhlmann, April 9, 2004.
-
Reinforcement Learning with Replacing Eligibility Traces
Satinder Singh and Richard Sutton, 1996
Discussion led by David Pardoe, March 26, 2004.
-
Learning to Predict by the Methods of Temporal Differences
Richard Sutton, 1988
Discussion led by Sasha Sherstov, March 5, 2004.
-
On the Complexity of Solving Markov Decision Problems
Michael Littman, Thomas Dean and Leslie Kaelbling, 1995
Discussion led by Matt Taylor, February 20, 2004.
-
Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces
Juan Santamaría, Richard Sutton and Ashwin Ram, 1998
Discussion led by Nick Jong, February 6, 2004.
-
Reinforcement Learning: A Survey
Leslie Kaelbling, Michael Littman and Andrew Moore, 1996
General background discussion, January 23, 2004.
Please report any broken links or inconsistencies to Shivaram Kalyanakrishnan.