Peter Stone's Selected Publications


Model-Based Exploration in Continuous State Spaces

Nicholas K. Jong and Peter Stone. Model-Based Exploration in Continuous State Spaces. In The Seventh Symposium on Abstraction, Reformulation, and Approximation, July 2007.


Abstract

Modern reinforcement learning algorithms effectively exploit experience data sampled from an unknown controlled dynamical system to compute a good control policy, but to obtain the necessary data they typically rely on naive exploration mechanisms or human domain knowledge. Approaches that first learn a model offer improved exploration in finite problems, but discrete model representations do not extend directly to continuous problems. This paper develops a method for approximating continuous models by fitting data to a finite sample of states, leading to finite representations compatible with existing model-based exploration mechanisms. Experiments with the resulting family of fitted-model reinforcement learning algorithms reveal the critical importance of how the continuous model is generalized from finite data. This paper demonstrates instantiations of fitted-model algorithms that lead to faster learning on benchmark problems than contemporary model-free RL algorithms that only apply generalization in estimating action values. Finally, the paper concludes that in continuous problems, the exploration-exploitation tradeoff is better construed as a balance between exploration and generalization.
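
The idea of fitting a continuous model to a finite sample of states can be illustrated with a small sketch. The code below is not the algorithm from the paper; it is a minimal illustration under stated assumptions: a one-dimensional state space, Gaussian kernel averaging of observed transitions, nearest-neighbour projection of predicted successors onto the sample states, and an R-max-style optimistic value for state-action pairs with too little nearby data. All names and parameters (`build_fitted_model`, `plan`, `bandwidth`, `known_mass`, `v_max`) are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Similarity between two 1-D states (assumed generalization scheme)."""
    return math.exp(-((x - y) / bandwidth) ** 2)


def nearest_index(states, x):
    """Index of the sample state closest to x."""
    return min(range(len(states)), key=lambda i: abs(states[i] - x))


def build_fitted_model(states, actions, transitions, bandwidth=0.1, known_mass=1.0):
    """Fit a finite model over the sample states (assumed sorted ascending).

    transitions is a list of (state, action, reward, next_state) tuples.
    Returns model[(i, a)], which is None when the pair is still "unknown",
    or (expected_reward, {successor_index: probability}) otherwise.
    """
    model = {}
    for i, x in enumerate(states):
        for a in actions:
            # Weight each observed transition by its similarity to x and
            # keep the observed change in state rather than the raw successor.
            weighted = [(gaussian_kernel(x, s, bandwidth), r, s2 - s)
                        for (s, act, r, s2) in transitions if act == a]
            mass = sum(w for w, _, _ in weighted)
            if mass < known_mass:
                model[(i, a)] = None  # too little nearby data: keep exploring
                continue
            reward = sum(w * r for w, r, _ in weighted) / mass
            successors = {}
            for w, _, delta in weighted:
                # Apply the observed change at x, clamp to the sampled range,
                # and project onto the nearest sample state.
                target = min(max(x + delta, states[0]), states[-1])
                j = nearest_index(states, target)
                successors[j] = successors.get(j, 0.0) + w / mass
            model[(i, a)] = (reward, successors)
    return model


def plan(states, actions, model, gamma=0.9, v_max=10.0, sweeps=200):
    """Value iteration on the finite model; unknown pairs get value v_max."""
    values = [0.0] * len(states)
    for _ in range(sweeps):
        for i in range(len(states)):
            q_values = []
            for a in actions:
                entry = model[(i, a)]
                if entry is None:
                    q_values.append(v_max)  # optimism drives exploration
                else:
                    reward, successors = entry
                    q_values.append(reward + gamma * sum(
                        p * values[j] for j, p in successors.items()))
            values[i] = max(q_values)
    return values


# Toy usage: eleven sample states on [0, 1] and two observed transitions.
states = [i / 10 for i in range(11)]
actions = [-1, 1]
transitions = [(0.50, 1, 0.0, 0.60), (0.52, 1, 0.0, 0.62)]
model = build_fitted_model(states, actions, transitions)
values = plan(states, actions, model)
```

In this sketch the `bandwidth` parameter controls how far each observation generalizes: a wider kernel marks more state-action pairs as known and so reduces exploration, which is one concrete way to read the abstract's balance between exploration and generalization.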

BibTeX Entry

@InProceedings{SARA07-jong,
       author="Nicholas K. Jong and Peter Stone",
       title="Model-Based Exploration in Continuous State Spaces",
       booktitle="The Seventh Symposium on Abstraction, Reformulation, and Approximation",
       month="July",
       year="2007",
       abstract={
                 Modern reinforcement learning algorithms effectively
                 exploit experience data sampled from an unknown
                 controlled dynamical system to compute a good control
                 policy, but to obtain the necessary data they
                 typically rely on naive exploration mechanisms or
                 human domain knowledge.  Approaches that first learn
                 a model offer improved exploration in finite
                 problems, but discrete model representations do not
                 extend directly to continuous problems.  This paper
                 develops a method for approximating continuous models
                 by fitting data to a finite sample of states, leading
                 to finite representations compatible with existing
                 model-based exploration mechanisms. Experiments with
                 the resulting family of fitted-model reinforcement
                 learning algorithms reveal the critical importance
                 of how the continuous model is generalized from
                 finite data.  This paper demonstrates instantiations
                 of fitted-model algorithms that lead to faster
                 learning on benchmark problems than contemporary
                 model-free RL algorithms that only apply
                 generalization in estimating action values.  Finally,
                 the paper concludes that in continuous problems, the
                 exploration-exploitation tradeoff is better construed
                 as a balance between exploration and generalization.
                 },
}
