Peter Stone's Selected Publications

Compositional Models for Reinforcement Learning

Compositional Models for Reinforcement Learning.
Nicholas K. Jong and Peter Stone.
In The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, September 2009.

Abstract

Innovations such as optimistic exploration, function approximation, and hierarchical decomposition have helped scale reinforcement learning to more complex environments, but these three ideas have rarely been studied together. This paper develops a unified framework that formalizes these algorithmic contributions as operators on learned models of the environment. Our formalism reveals some synergies among these innovations, and it suggests a straightforward way to compose them. The resulting algorithm, Fitted R-MAXQ, is the first to combine the function approximation of fitted algorithms, the efficient model-based exploration of R-MAX, and the hierarchical decomposition of MAXQ.

BibTeX Entry

@InProceedings{ECML09-jong,
    author="Nicholas K. Jong and Peter Stone",
    title="Compositional Models for Reinforcement Learning",
    booktitle="The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases",
    month="September",
    year="2009",
    abstract={Innovations such as optimistic exploration, function
                 approximation, and hierarchical decomposition have
                 helped scale reinforcement learning to more complex
                 environments, but these three ideas have rarely been
                 studied together.  This paper develops a unified
                 framework that formalizes these algorithmic
                 contributions as operators on learned models of the
                 environment.  Our formalism reveals some synergies
                 among these innovations, and it suggests a
                 straightforward way to compose them.  The resulting
                 algorithm, Fitted R-MAXQ, is the first to combine
                 the function approximation of fitted algorithms, the
                 efficient model-based exploration of R-MAX, and the
                 hierarchical decomposition of MAXQ.},
}
