 SarsaLandmark: an algorithm for learning in POMDPs with landmarks
 Michael R. James and  Satinder Singh, 2009
 Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement
 Michael T. Todd,  Yael Niv, and  Jonathan D. Cohen, 2009
 Analysis of an Evolutionary Reinforcement Learning Method in a Multiagent Domain
 Jan Hendrik Metzen,  Mark Edgington,  Yohannes Kassahun, and  Frank Kirchner, 2008
 Looping suffix tree-based inference of partially observable hidden state
 Michael P. Holmes and Charles Lee Isbell, Jr., 2006
 Anytime Point-Based Approximations for Large POMDPs
 Joelle Pineau,  Geoffrey J. Gordon, and  Sebastian Thrun, 2006
 Scaling Internal-State Policy-Gradient Methods for POMDPs
 Douglas Aberdeen and  Jonathan Baxter, 2002
 An $\epsilon$-Optimal Grid-Based Algorithm for Partially Observable Markov Decision Processes
 Blai Bonet, 2002
 On the Existence of Fixed Points for Q-Learning and Sarsa in Partially Observable Domains
 Theodore J. Perkins and  Mark D. Pendrith, 2002
 Reinforcement Learning for POMDPs Based on Action Values and Stochastic Optimization
 Theodore J. Perkins, 2002
 Evolutionary Search, Stochastic Policies with Memory, and Reinforcement Learning with Hidden State
 Matthew R. Glickman and  Katia Sycara, 2001
 Value-Function Approximations for Partially Observable Markov Decision Processes
 Milos Hauskrecht, 2000
 Monte Carlo POMDPs
 Sebastian Thrun, 2000
 Reinforcement Learning Using Approximate Belief States
 Andrés Rodríguez,  Ronald Parr, and  Daphne Koller, 1999
 Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes
 John Loch and  Satinder Singh, 1998
 An Analysis of Direct Reinforcement Learning in Non-Markovian Domains
 Mark D. Pendrith and  Michael J. McGarity, 1998
 Reinforcement Learning: An Introduction
 Richard S. Sutton and  Andrew G. Barto, 1998
 Reinforcement Learning with Selective Perception and Hidden State
 Andrew Kachites McCallum, 1996
 Reinforcement learning with replacing eligibility traces
 Satinder P. Singh and  Richard S. Sutton, 1996
 Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems
 Tommi Jaakkola,  Satinder P. Singh, and  Michael I. Jordan, 1995
 Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State
 R. Andrew McCallum, 1995
 Approximating Optimal Policies for Partially Observable Stochastic Domains
 Ronald Parr and  Stuart Russell, 1995
 Acting optimally in partially observable stochastic domains
 Anthony R. Cassandra,  Leslie Pack Kaelbling, and  Michael L. Littman, 1994
 Learning Without State-Estimation in Partially Observable Markovian Decision Processes
 Satinder P. Singh,  Tommi Jaakkola, and  Michael I. Jordan, 1994
 Reinforcement learning with hidden states
 Long-Ji Lin and  Tom M. Mitchell, 1993
 Overcoming Incomplete Perception with Utile Distinction Memory
 R. Andrew McCallum, 1993
 Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach
 Lonnie Chrisman, 1992
 Cost-Sensitive Reinforcement Learning for Adaptive Classification and Control
 Ming Tan, 1991
 Learning to perceive and act by trial and error
 Steven D. Whitehead and  Dana H. Ballard, 1991
 A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms
 George E. Monahan, 1982
 The Optimal Control of Partially Observable Markov Processes Over the Infinite Horizon: Discounted Costs
 Edward J. Sondik, 1978
 Optimal Control of Markov Processes with Incomplete State Information
 K. J. Åström, 1965