Shivaram's Reading List


Learning Methods

Exploiting Best-Match Equations for Efficient Reinforcement Learning
Harm van Seijen, Shimon Whiteson, Hado van Hasselt, and Marco Wiering, 2011

Insights in Reinforcement Learning: formal analysis and empirical evaluation of temporal-difference learning algorithms
Hado Philip van Hasselt, 2011

Relative Entropy Policy Search
Jan Peters, Katharina Mülling, and Yasemin Altün, 2010

Model-based reinforcement learning with nearly tight exploration complexity bounds
István Szita and Csaba Szepesvári, 2010

Reinforcement learning of motor skills in high dimensions: A path integral approach
Evangelos Theodorou, Jonas Buchli, and Stefan Schaal, 2010

The CMA Evolution Strategy: A Tutorial
Nikolaus Hansen, 2009

Learning motor primitives for robotics
Jens Kober and Jan Peters, 2009

Efficient covariance matrix update for variable metric evolution strategies
Thorsten Suttorp, Nikolaus Hansen, and Christian Igel, 2009

A Theoretical and Empirical Analysis of Expected Sarsa
Harm van Seijen, Hado van Hasselt, Shimon Whiteson, and Marco Wiering, 2009

Incremental Natural Actor-Critic Algorithms
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, and Mark Lee, 2008

Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Faustino Gomez, Jürgen Schmidhuber, and Risto Miikkulainen, 2008

Similarities and differences between policy gradient methods and evolution strategies
Verena Heidrich-Meisner and Christian Igel, 2008

Evolution Strategies for Direct Policy Search
Verena Heidrich-Meisner and Christian Igel, 2008

Genetic Programming: An Introduction and Tutorial, with a Survey of Techniques and Applications
William B. Langdon, Riccardo Poli, Nicholas Freitag McPhee, and John R. Koza, 2008

Analysis of an Evolutionary Reinforcement Learning Method in a Multiagent Domain
Jan Hendrik Metzen, Mark Edgington, Yohannes Kassahun, and Frank Kirchner, 2008

Reinforcement learning of motor skills with policy gradients
Jan Peters and Stefan Schaal, 2008

Natural Actor-Critic
Jan Peters and Stefan Schaal, 2008

Sample-based Learning and Search with Permanent and Transient Memories
David Silver, Richard S. Sutton, and Martin Müller, 2008

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, and Michael Bowling, 2008

Sample Complexity of Policy Search with Known Dynamics
Peter L. Bartlett and Ambuj Tewari, 2007

Bayesian actor-critic algorithms
Mohammad Ghavamzadeh and Yaakov Engel, 2007

Bayesian Policy Gradient Algorithms
Mohammad Ghavamzadeh and Yaakov Engel, 2007

Batch Reinforcement Learning in a Complex Domain
Shivaram Kalyanakrishnan and Peter Stone, 2007

Large Scale Reinforcement Learning using Q-Sarsa(λ) and Cascading Neural Networks
Steffen Nissen, 2007

Representation Transfer for Reinforcement Learning
Matthew E. Taylor and Peter Stone, 2007

Adaptive Representations for Reinforcement Learning
Shimon Azariah Whiteson, 2007

Evolutionary Function Approximation for Reinforcement Learning
Shimon Whiteson and Peter Stone, 2006

On-line evolutionary computation for reinforcement learning in stochastic domains
Shimon Whiteson and Peter Stone, 2006

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
Martin Riedmiller, 2005

A Tutorial on the Cross-Entropy Method
Pieter-Tjerk de Boer, Dirk P. Kroese, Shie Mannor, and Reuven Y. Rubinstein, 2005

Machine Learning for Fast Quadrupedal Locomotion
Nate Kohl and Peter Stone, 2004

Efficient Evolution of Neural Networks Through Complexification
Kenneth Owen Stanley, 2004

On Actor-Critic Algorithms
Vijay R. Konda and John N. Tsitsiklis, 2003

Reinforcement Learning as Classification: Leveraging Modern Classifiers
Michail G. Lagoudakis and Ronald Parr, 2003

Scaling Internal-State Policy-Gradient Methods for POMDPs
Douglas Aberdeen and Jonathan Baxter, 2002

Approximately Optimal Approximate Reinforcement Learning
Sham Kakade and John Langford, 2002

Learning from Scarce Experience
Leonid Peshkin and Christian R. Shelton, 2002

Infinite-Horizon Policy-Gradient Estimation
Jonathan Baxter and Peter L. Bartlett, 2001

A Natural Policy Gradient
Sham Kakade, 2001

Reinforcement Learning in POMDP's via Direct Gradient Ascent
Jonathan Baxter and Peter L. Bartlett, 2000

Policy Search via Density Estimation
Andrew Y. Ng, Ronald Parr, and Daphne Koller, 2000

PEGASUS: A policy search method for large MDPs and POMDPs
Andrew Y. Ng and Michael Jordan, 2000

Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour, 2000

Gradient Descent for General Reinforcement Learning
Leemon Baird and Andrew Moore, 1999

Solving Non-Markovian Control Tasks with Neuro-Evolution
Faustino J. Gomez and Risto Miikkulainen, 1999

Evolutionary Algorithms for Reinforcement Learning
David E. Moriarty, Alan C. Schultz, and John J. Grefenstette, 1999

Robot Shaping: An Experiment in Behavior Engineering
Marco Dorigo and Marco Colombetti, 1998

Reinforcement Learning: An Introduction
Richard S. Sutton and Andrew G. Barto, 1998

Neuro-Dynamic Programming
Dimitri P. Bertsekas and John N. Tsitsiklis, 1996

Reinforcement learning with replacing eligibility traces
Satinder P. Singh and Richard S. Sutton, 1996

On-line Q-learning using connectionist systems
G. A. Rummery and M. Niranjan, 1994

Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time
Andrew W. Moore and Christopher G. Atkeson, 1993

Efficient learning and planning within the Dyna framework
Jing Peng and Ronald J. Williams, 1993

Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
Long-Ji Lin, 1992

Q-Learning
Christopher J. C. H. Watkins and Peter Dayan, 1992

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
Ronald J. Williams, 1992

Learning Sequential Decision Rules Using Simulation Models and Competition
John J. Grefenstette, Connie Loggia Ramsey, and Alan C. Schultz, 1990

Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
Richard S. Sutton, 1990

Neuronlike adaptive elements that can solve difficult learning control problems
Andrew G. Barto, Richard S. Sutton, and Charles W. Anderson, 1983