Shivaram's Reading List

Parameter

Characterizing reinforcement learning methods through parameterized learning problems
Shivaram Kalyanakrishnan and Peter Stone, 2011

On Learning with Imperfect Representations
Shivaram Kalyanakrishnan and Peter Stone, 2011

Protecting Against Evaluation Overfitting in Empirical Reinforcement Learning
Shimon Whiteson, Brian Tanner, Matthew E. Taylor, and Peter Stone, 2011

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Carlton Downey and Scott Sanner, 2010

Toward Off-Policy Learning Control with Function Approximation
Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, and Richard S. Sutton, 2010

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Marek Petrik, Gavin Taylor, Ron Parr, and Shlomo Zilberstein, 2010

The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning
Carlos Diuk, Lihong Li, and Bethany R. Leffler, 2009

Improving Optimistic Exploration in Model-Free Reinforcement Learning
Marek Grześ and Daniel Kudenko, 2009

A Method for Handling Uncertainty in Evolutionary Optimization With an Application to Feedback Control of Combustion
Nikolaus Hansen, André S.P. Niederberger, Lino Guzzella, and Petros Koumoutsakos, 2009

Regularization and feature selection in least-squares temporal difference learning
J. Zico Kolter and Andrew Y. Ng, 2009

Learning Representation and Control in Markov Decision Processes: New Frontiers
Sridhar Mahadevan, 2009

Ontogenetic and Phylogenetic Reinforcement Learning
Julian Togelius, Tom Schaul, Daan Wierstra, Christian Igel, Faustino Gomez, and Jürgen Schmidhuber, 2009

Generalized Domains for Empirical Evaluations in Reinforcement Learning
Shimon Whiteson, Brian Tanner, Matthew E. Taylor, and Peter Stone, 2009

A Theoretical and Empirical Analysis of Expected Sarsa
Harm van Seijen, Hado van Hasselt, Shimon Whiteson, and Marco Wiering, 2009

An empirical evaluation of supervised learning in high dimensions
Rich Caruana, Nikolaos Karampatziakis, and Ainur Yessenalina, 2008

Temporal Difference Updating without a Learning Rate
Marcus Hutter and Shane Legg, 2008

The many faces of optimism: a unifying approach
István Szita and András Lőrincz, 2008

SATzilla: Portfolio-based Algorithm Selection for SAT
Lin Xu, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown, 2008

Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
Martin Riedmiller, Jan Peters, and Stefan Schaal, 2007

An empirical comparison of supervised learning algorithms
Rich Caruana and Alexandru Niculescu-Mizil, 2006

Learning the structure of Factored Markov Decision Processes in reinforcement learning problems
Thomas Degris, Olivier Sigaud, and Pierre-Henri Wuillemin, 2006

Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
Abraham P. George and Warren B. Powell, 2006

Function Approximation via Tile Coding: Automating Parameter Choice
Alexander A. Sherstov and Peter Stone, 2005

A Robot that Reinforcement-Learns to Identify and Memorize Important Previous Observations
Bram Bakker, Viktor Zhumatiy, Gabriel Gruener, and Jürgen Schmidhuber, 2003

Boosting as a Metaphor for Algorithm Design
Kevin Leyton-Brown, Eugene Nudelman, Galen Andrew, Jim McFadden, and Yoav Shoham, 2003

Using MDP Characteristics to Guide Exploration in Reinforcement Learning
Bohdana Ratitch and Doina Precup, 2003

Characterizing Markov Decision Processes
Bohdana Ratitch and Doina Precup, 2002

A Perspective View and Survey of Meta-Learning
Ricardo Vilalta and Youssef Drissi, 2002

Scaling to Very Very Large Corpora for Natural Language Disambiguation
Michele Banko and Eric Brill, 2001

Random Forests
Leo Breiman, 2001

Convergence of Optimistic and Incremental Q-Learning
Eyal Even-Dar and Yishay Mansour, 2001

Algorithm portfolios
Carla P. Gomes and Bart Selman, 2001

Local Search Algorithms for SAT: An Empirical Evaluation
Holger H. Hoos and Thomas Stützle, 2000

Meta-Learning by Landmarking Various Learning Algorithms
Bernhard Pfahringer, Hilan Bensusan, and Christophe Giraud-Carrier, 2000

An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants
Eric Bauer and Ron Kohavi, 1999

Symposium on Applications of Reinforcement Learning: Final Report for NSF Grant IIS-9810208
Pat Langley and Mark Pendrith, 1998

Experiments with a New Boosting Algorithm
Yoav Freund and Robert E. Schapire, 1996

Incremental Multi-Step Q-Learning
Jing Peng and Ronald J. Williams, 1996

Bagging, Boosting, and C4.5
J. Ross Quinlan, 1996

Recursive Automatic Bias Selection for Classifier Construction
Carla E. Brodley, 1995

Problem Solving with Reinforcement Learning
Gavin Adrian Rummery, 1995

Using a Genetic Algorithm to Search for the Representational Bias of a Collective Reinforcement Learner
Helen G. Cobb and Peter Bock, 1994

An Optimization-based Categorization of Reinforcement Learning Environments
Michael L. Littman, 1993

Interactions between Learning and Evolution
David Ackley and Michael Littman, 1992

Inductive Biases in a Reinforcement Learner
Helen G. Cobb, 1992

Learning from Delayed Rewards
Christopher John Cornish Hellaby Watkins, 1989

How Evaluation Guides AI Research: The Message Still Counts More than the Medium
Paul R. Cohen and Adele E. Howe, 1988

Machine Learning as an Experimental Science
Pat Langley, 1988

Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm
Nick Littlestone, 1987

Brains, Behavior and Robotics
James Sacra Albus, 1981