UTCS Reinforcement Learning Reading Group
The UTCS Reinforcement Learning Reading Group is a student-run group that discusses research papers related to reinforcement learning. Since its first meeting in the spring of 2004, the group has served as a forum for students to discuss interesting research ideas in an informal setting. Meetings are usually held in the afternoon, and refreshments are provided. Occasionally, the group hosts invited talks. Matt Taylor coordinated the group from Spring 2004 through Fall 2005; it is currently managed by Shivaram Kalyanakrishnan.
This page provides information about group meetings. It also lists useful resources for reinforcement learning and serves as a repository of all past readings.
New members are always welcome! Interested students or researchers may also subscribe to the group e-mailing list.
Meeting Time and Place
The group meets at 11:00 a.m. every other Monday in ENS 31NM. The meeting time and place may occasionally change.
Next Meeting
Yet to be scheduled.
Communication
The reading group has an e-mailing list (rlreadinggroup@utlists.utexas.edu) on which regular announcements are made.
- To subscribe to the list or to unsubscribe from it, send your request through e-mail to shivaram@cs.utexas.edu.
- To send e-mail to the list, address it to rlreadinggroup@utlists.utexas.edu.
Reinforcement Learning Resources
Suggested Readings for Future Meetings
-
Maximum Entropy Inverse Reinforcement Learning
Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell and Anind K. Dey, 2008
Suggestion from Peter Stone and Kurt Dresner.
-
A Worst-Case Comparison Between Temporal Difference and Residual Gradient with Linear Function Approximation
Lihong Li, 2008
Suggestion from Nick Jong.
-
Learning for Control from Multiple Demonstrations
Adam Coates, Pieter Abbeel and Andrew Y. Ng, 2008
Suggestion from Nick Jong.
-
Learning All Optimal Policies with Multiple Criteria
Leon Barrett and Srini Narayanan, 2008
Suggestion from Nick Jong.
-
Transfer of Samples in Batch Reinforcement Learning
Alessandro Lazaric, Marcello Restelli and Andrea Bonarini, 2008
Suggestion from Nick Jong.
-
Unifying Temporal and Structural Credit Assignment Problems
Adrian K. Agogino and Kagan Tumer, 2004
Suggestion from Shivaram Kalyanakrishnan.
-
On Local Rewards and the Scalability of Distributed Reinforcement Learning
J. Andrew Bagnell and Andrew Y. Ng, 2006
Suggestion from Shivaram Kalyanakrishnan.
-
Multi-Agent Reinforcement Learning in Common Interest and Fixed Sum Stochastic Games: An Experimental Study
Avraham Bab and Ronen I. Brafman, 2008
Suggestion from Peter Stone.
Paper Readings and Talks (Reverse Chronological Order)
Fall 2009
-
Ph.D. Oral Proposal: Practice Talk
Shivaram Kalyanakrishnan, November 2, 2009.
-
The Adaptive k-Meteorologists Problem and Its Application to Structure Learning and Feature Selection in Reinforcement Learning
Carlos Diuk, Lihong Li and Bethany R. Leffler, 2009
Discussion led by Doran Chakraborty, October 19, 2009.
-
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach
Engin İpek, Onur Mutlu, José F. Martínez and Rich Caruana, 2008
Discussion led by Matthew Hausknecht, September 28, 2009.
-
Decision theory, reinforcement learning, and the brain
Peter Dayan and Nathaniel D. Daw, 2008
Discussion led by Igor Karpov, September 14, 2009.
Summer 2009
Spring 2009
-
Experiments in Animal Behavior
Presentation by Brad Knox, April 29, 2009.
-
Using Reinforcement Learning to Adapt an Imitation Task
Florent Guenter and Aude G. Billard, 2007
Discussion led by Brad Knox, April 17, 2009.
-
Multi-resolution Exploration in Continuous Spaces
Ali Nouri and Michael L. Littman, 2008
Discussion led by Todd Hester, April 8, 2009.
-
Evolving Neural Networks for Strategic Decision-Making Problems
Nate Kohl and Risto Miikkulainen, 2009
Discussion led by Nate Kohl, April 3, 2009.
-
Gaussian Process Dynamic Programming
Marc Peter Deisenroth, Carl Edward Rasmussen and Jan Peters, 2009
Discussion led by Tobias Jung, March 11, 2009.
-
An Empirical Analysis of Value Function-Based and Policy Search Reinforcement Learning
Shivaram Kalyanakrishnan and Peter Stone, 2009
Discussion led by Shivaram Kalyanakrishnan, March 6, 2009.
Fall 2008
-
Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games
Maria Cutumisu, Duane Szafron, Michael Bowling and Richard S. Sutton, 2008
Discussion led by Jacob Schrum, December 10, 2008.
-
An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning
Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield and Michael L. Littman, 2008
Discussion led by Nick Jong, November 26, 2008.
-
An Analysis of Reinforcement Learning with Function Approximation
Francisco S. Melo, Sean P. Meyn and M. Isabel Ribeiro, 2008
Discussion led by Shivaram Kalyanakrishnan, November 12, 2008.
-
Strategy Evaluation in Extensive Games with Importance Sampling
Michael Bowling, Michael Johanson, Neil Burch and Duane Szafron, 2008
Discussion led by Doran Chakraborty, October 29, 2008.
-
HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot
Mohan Sridharan, Jeremy Wyatt and Richard Dearden, 2008
Presentation by Mohan Sridharan, October 22, 2008.
-
Knows What It Knows: A Framework For Self-Aware Learning
Lihong Li, Michael L. Littman and Thomas J. Walsh, 2008
Discussion led by Todd Hester, October 1, 2008.
-
Reinforcement Learning with Limited Reinforcement: Using Bayes Risk for Active Learning in POMDPs
Finale Doshi, Joelle Pineau and Nicholas Roy, 2008
Discussion led by Nick Jong, September 11, 2008.
Summer 2008
-
Planning and Learning in Environments with Delayed Feedback
Thomas J. Walsh, Ali Nouri, Lihong Li and Michael L. Littman, 2007
Discussion led by Shivaram Kalyanakrishnan, June 20, 2008.
-
Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning
Arthur Guez, Robert D. Vincent, Massimo Avoli and Joelle Pineau, 2008
Discussion led by Matt Taylor, May 30, 2008.
Spring 2008
-
Perspectives on Reinforcement Learning: Group Discussion
Discussion led by Michael Quinlan, April 25, 2008.
-
Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis
Claudia V. Goldman and Shlomo Zilberstein, 2004
Discussion led by Doran Chakraborty, April 11, 2008.
-
Planning with Durative Actions in Stochastic Domains
Mausam and Daniel S. Weld, 2007
Discussion led by Doran Chakraborty, April 4, 2008.
-
Fourteen Declarative Principles for an Integrative Science of the Temporal Dynamics of Learning
Richard S. Sutton, 2008
Discussion led by Todd Hester, March 21, 2008.
-
Factor-Guided Motion Planning for a Robot Arm
Jaesik Choi and Eyal Amir, 2007
Discussion led by Michael Quinlan, February 29, 2008.
Invited Speaker: Eyal Amir.
-
Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man
István Szita and András Lőrincz, 2007
Discussion led by Matt Taylor, February 15, 2008.
-
A Natural Policy Gradient
Sham Kakade, 2002
Discussion led by Andrew Dreher, January 25, 2008.
Fall 2007
-
ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning, Part 4
Pascal Poupart, Mohammad Ghavamzadeh and Yaakov Engel, 2007
Reinforcement learning with Gaussian processes
Yaakov Engel, Shie Mannor and Ron Meir, 2005
Discussion led by Joe Reisinger, December 7, 2007.
-
ICML-07 Tutorial on Bayesian Methods for Reinforcement Learning, Parts 1, 2, 3
Pascal Poupart, Mohammad Ghavamzadeh and Yaakov Engel, 2007
Discussion led by Nick Jong, November 30, 2007.
-
Constructing Basis Functions from Directed Graphs for Value Function Approximation
Jeff Johns and Sridhar Mahadevan, 2007
Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes
Sridhar Mahadevan and Mauro Maggioni, 2007
Discussion led by Matt Taylor, November 16, 2007.
-
The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems
Caroline Claus and Craig Boutilier, 1998
Discussion led by Doran Chakraborty, November 9, 2007.
-
Linearly-solvable Markov Decision Problems
Emanuel Todorov, 2006
Discussion led by Ian Fasel and Shivaram Kalyanakrishnan, October 12, 2007.
-
Combining Online and Offline Knowledge in UCT
Sylvain Gelly and David Silver, 2007
Discussion led by Joe Reisinger, September 21, 2007.
-
Dynamic Positioning in 3D RoboCup Soccer
Presentation by Sahar Asadi, September 7, 2007.
Summer 2007
-
Dirichlet Process Mixtures
Khalid El-Arini, 2005
A Bayesian Framework for Reinforcement Learning
Malcolm Strens, 2000
Discussion led by Nick Jong, June 29, 2007.
-
Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach
Aaron Wilson, Alan Fern, Soumya Ray and Prasad Tadepalli, 2007
Discussion led by Todd Hester, June 15, 2007.
-
Nash Q-Learning for General-Sum Stochastic Games
Junling Hu and Michael P. Wellman, 2003
Discussion led by Shimon Whiteson, June 8, 2007.
-
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
Colin McMillen and Manuela Veloso, 2007
Discussion led by Shivaram Kalyanakrishnan, June 1, 2007.
-
Efficient Reinforcement Learning with Relocatable Action Models
Bethany R. Leffler, Michael L. Littman and Timothy Edmunds, 2007
Discussion led by Matt Taylor, May 25, 2007.
Spring 2007
-
Adaptive Representations for Reinforcement Learning
Shimon Azariah Whiteson, 2007
Ph.D. defense by Shimon Whiteson, April 20, 2007.
-
Integrating Guidance into Relational Reinforcement Learning
Kurt Driessens and Sašo Džeroski, 2004
Discussion led by Andrew Dreher, April 6, 2007.
-
Online Learning and Exploiting Relational Models in Reinforcement Learning
Tom Croonenborghs, Jan Ramon, Hendrik Blockeel and Maurice Bruynooghe, 2007
Discussion led by Matt Taylor, March 23, 2007.
-
State Similarity Based Approach for Improving Performance in RL
Sertan Girgin, Faruk Polat and Reda Alhajj, 2007
Discussion led by Todd Hester, March 9, 2007.
-
Deictic Option Schemas
Balaraman Ravindran, Andrew G. Barto and Vimal Mathew, 2007
Discussion led by Rahul Iyer, March 2, 2007.
-
Bayesian Q-learning
Richard Dearden, Nir Friedman and Stuart Russell, 1998
Discussion led by David Pardoe, February 9, 2007.
-
An Intrinsic Reward Mechanism for Efficient Exploration
Özgür Şimşek and Andrew G. Barto, 2006
Discussion led by Shivaram Kalyanakrishnan, January 26, 2007.
Fall 2006
-
Decision Tree Methods for Finding Reusable MDP Homomorphisms
Alicia Peregrin Wolfe and Andrew G. Barto, 2006
Discussion led by Rahul Iyer, December 4, 2006.
-
Robot planning in partially observable continuous domains
Josep M. Porta, Matthijs T. J. Spaan and Nikos Vlassis, 2005
Discussion led by Igor Karpov, November 20, 2006.
-
Reinforcement Learning in POMDP's via Direct Gradient Ascent
Jonathan Baxter and Peter L. Bartlett, 2000
Discussion led by Yaxin Liu, November 6, 2006.
-
Reinforcement Learning for Optimized Trade Execution
Yuriy Nevmyvaka, Yi Feng and Michael Kearns, 2006
Discussion led by Andrew Dreher, October 23, 2006.
-
Looping Suffix Tree-Based Inference of Partially Observable Hidden State
Michael P. Holmes and Charles Lee Isbell, Jr., 2006
Discussion led by Nick Jong, October 9, 2006.
-
Sparse Cooperative Q-learning
Jelle R. Kok and Nikos Vlassis, 2004
Discussion led by Shivaram Kalyanakrishnan, September 25, 2006.
-
Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains
Vishal Soni and Satinder Singh, 2006
Discussion led by Matt Taylor, September 11, 2006.
Summer 2006
Spring 2006
-
State Space Reduction for Autonomous Reinforcement Learning
Mehran Asadi and Manfred Huber, 2004
Accelerating Action Dependent Hierarchical Reinforcement Learning Through Autonomous Subgoal Discovery
Mehran Asadi and Manfred Huber, 2005
Discussion led by Nick Jong, April 28, 2006.
-
Autonomous Helicopter Flight via Reinforcement Learning
Andrew Y. Ng, H. Jin Kim, Michael I. Jordan and Shankar Sastry, 2004
Discussion led by Shivaram Kalyanakrishnan, April 14, 2006.
-
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David McAllester, Satinder Singh and Yishay Mansour, 2000
Discussion led by Shimon Whiteson, March 31, 2006.
-
Building Portable Options: Skill Transfer in Reinforcement Learning
George Konidaris and Andrew Barto, 2006
Discussion led by Matt Taylor, March 10, 2006.
-
CBR for State Value Function Approximation in Reinforcement Learning
Thomas Gabel and Martin Riedmiller, 2005
Discussion led by Nick Jong, February 24, 2006.
-
Why (PO)MDPs Lose for Spatial Tasks and What to Do About It
Terran Lane and William D. Smart, 2005
Discussion led by Shivaram Kalyanakrishnan, February 10, 2006.
-
Temporal-Difference Networks
Richard S. Sutton and Brian Tanner, 2005
Discussion led by Bikram Banerjee, January 27, 2006.
Fall 2005
-
Developing navigation behavior through self-organizing distinctive state abstraction
Jefferson Provost, Benjamin J. Kuipers and Risto Miikkulainen, 2006
Discussion led by Jeff Provost, December 14, 2005.
-
An Algorithmic Description of XCS
Martin V. Butz and Stewart W. Wilson, 2000
Discussion led by David Pardoe, December 2, 2005.
-
Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005
A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm
Martin Riedmiller and Heinrich Braun, 1993
Discussion led by Shivaram Kalyanakrishnan, November 18, 2005.
-
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State Spaces
Andrew W. Moore and Christopher G. Atkeson, 1995
Discussion led by Nick Jong, November 4, 2005.
-
Samuel Meets Amarel: Automating Value Function Approximation using Global State Space Analysis
Sridhar Mahadevan, 2005
Discussion led by Jeff Provost, October 14, 2005.
-
Near-Optimal Reinforcement Learning in Polynomial Time
Michael Kearns and Satinder Singh, 1998
Discussion led by Yaxin Liu, September 23, 2005.
Summer 2005
-
An Empirical Evaluation of Interval Estimation for Markov Decision Processes
Alexander L. Strehl and Michael L. Littman, 2004
Discussion led by Shimon Whiteson, August 26, 2005.
-
Using Advice to Transfer Knowledge Acquired in One Reinforcement Learning Task to Another
Lisa Torrey, Trevor Walker, Jude Shavlik and Richard Maclin, 2005
Discussion led by Matt Taylor, August 12, 2005.
-
Guiding Inference through Relational Reinforcement Learning
Nima Asgharbeygi, Negin Nejati, Pat Langley and Sachiyo Arai, 2005
Discussion led by Matt Taylor, June 17, 2005.
-
Least-Squares Policy Iteration
Michail G. Lagoudakis and Ronald Parr, 2003
Discussion led by Lily Mihalkova, May 27, 2005.
-
Relational Reinforcement Learning
Sašo Džeroski, Luc De Raedt and Kurt Driessens, 2001
Discussion led by Greg Kuhlmann, May 6, 2005.
Spring 2005
-
Intrinsically Motivated Reinforcement Learning
Satinder Singh, Andrew G. Barto and Nuttapong Chentanez, 2005
Discussion led by Nick Jong, April 22, 2005.
-
Gradient Descent for General Reinforcement Learning
Leemon Baird and Andrew Moore, 1999
Discussion led by Shimon Whiteson, April 8, 2005.
-
Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks
Chris Drummond, 2002
Discussion led by Matt Taylor, March 25, 2005.
-
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
Michael Kearns, Yishay Mansour and Andrew Y. Ng, 2001
Discussion led by David Pardoe, March 4, 2005.
-
No Free Lunch Theorems for Optimization
David H. Wolpert and William G. Macready, 1996
Discussion led by Shimon Whiteson, February 18, 2005.
-
Common Myths and Misstatements about Reinforcement Learning
Various Authors, 1999 onwards
Discussion led by Matt Taylor, February 4, 2005.
Fall 2004
-
Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail G. Lagoudakis and Ronald Parr, 2003
Discussion led by Nick Jong, November 29, 2004.
-
Discovering Hierarchy in Reinforcement Learning with HEXQ
Bernhard Hengst, 2002
Discussion led by Mazda Ahmadi, November 15, 2004.
-
Synthesizing Policy Search and Temporal Difference Methods for Reinforcement Learning
Discussion led by Shimon Whiteson, November 1, 2004.
-
Implicit Negotiation in Repeated Games
Michael L. Littman and Peter Stone, 2001
Discussion led by Matt Taylor, October 25, 2004.
-
Markov Games as a Framework for Multi-Agent Reinforcement Learning
Michael L. Littman, 1994
Discussion led by David Pardoe, October 11, 2004.
-
RL Methodology
Discussion led by Lily Mihalkova, September 20, 2004.
Summer 2004
-
Three Automated Stock-Trading Agents: A Comparative Study
Alexander A. Sherstov and Peter Stone, 2004
Practice talk by Sasha Sherstov, July 9, 2004.
Bidding for Customer Orders in TAC SCM
David Pardoe and Peter Stone, 2004
Practice talk by David Pardoe, July 9, 2004.
-
The MAXQ Method for Hierarchical Reinforcement Learning
Thomas G. Dietterich, 1998
Reinforcement Learning: A Survey, Section 6
Leslie Pack Kaelbling, Michael L. Littman and Andrew W. Moore, 1996
Discussion led by Jeff Provost.
-
Acting Optimally in Partially Observable Stochastic Domains
Anthony R. Cassandra, Leslie Pack Kaelbling and Michael L. Littman, 1994
Discussion led by Shimon Whiteson, June 11, 2004.
-
Reinforcement Learning: A Survey, Sections 4 and 5
Leslie Pack Kaelbling, Michael L. Littman and Andrew W. Moore, 1996
Reinforcement Learning: An Introduction, Chapter 9
Richard S. Sutton and Andrew G. Barto, 1998
Discussion on model-free and model-based learning led by Peggy Fidelman, May 29, 2004.
-
Residual Algorithms: Reinforcement Learning with Function Approximation
Leemon Baird, 1995
Discussion led by Lily Mihalkova, May 14, 2004.
Spring 2004
-
A Quantitative Study of Hypothesis Selection
Philip W. L. Fong, 1995
Discussion led by Nick Jong, April 23, 2004.
-
Machine Learning for Fast Quadrupedal Locomotion
Nate Kohl and Peter Stone, 2004
Practice Talk by Nate Kohl, April 16, 2004.
-
Policy invariance under reward transformations: Theory and application to reward shaping
Andrew Y. Ng, Daishi Harada and Stuart Russell, 1999
Discussion led by Greg Kuhlmann, April 9, 2004.
-
Reinforcement Learning with Replacing Eligibility Traces
Satinder P. Singh and Richard S. Sutton, 1996
Discussion led by David Pardoe, March 26, 2004.
-
Learning to Predict by the Methods of Temporal Differences
Richard S. Sutton, 1988
Discussion led by Sasha Sherstov, March 5, 2004.
-
On the Complexity of Solving Markov Decision Problems
Michael L. Littman, Thomas L. Dean and Leslie Pack Kaelbling, 1995
Discussion led by Matt Taylor, February 20, 2004.
-
Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces
Juan C. Santamaría, Richard S. Sutton and Ashwin Ram, 1998
Discussion led by Nick Jong, February 6, 2004.
-
Reinforcement Learning: A Survey
Leslie Pack Kaelbling, Michael L. Littman and Andrew W. Moore, 1996
General background discussion, January 23, 2004.
Please report any broken links or inconsistencies to Shivaram Kalyanakrishnan.