Peter Stone's Selected Publications

Model-Based Function Approximation for Reinforcement Learning

Nicholas K. Jong and Peter Stone. Model-Based Function Approximation for Reinforcement Learning. In The Sixth International Joint Conference on Autonomous Agents and Multiagent Systems, May 2007.

Abstract

Reinforcement learning promises a generic method for adapting agents to arbitrary tasks in arbitrary stochastic environments, but applying it to new real-world problems remains difficult, a few impressive success stories notwithstanding. Most interesting agent-environment systems have large state spaces, so performance depends crucially on efficient generalization from a small amount of experience. Current algorithms rely on model-free function approximation, which estimates the long-term values of states and actions directly from data and assumes that actions have similar values in similar states. This paper proposes model-based function approximation, which combines two forms of generalization by assuming that in addition to having similar values in similar states, actions also have similar effects. For one family of generalization schemes known as averagers, computation of an approximate value function from an approximate model is shown to be equivalent to the computation of the exact value function for a finite model derived from data. This derivation both integrates two independent sources of generalization and permits the extension of model-based techniques developed for finite problems. Preliminary experiments with a novel algorithm, AMBI (Approximate Models Based on Instances), demonstrate that this approach yields faster learning on some standard benchmark problems than many contemporary algorithms.
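
The idea in the abstract, that an averager applied to an approximate model amounts to solving a finite model derived from data exactly, can be illustrated with a minimal sketch. This is not the authors' AMBI algorithm. The toy one-dimensional task, the Gaussian-kernel averager, the bandwidth, the discount factor, and every function name below are assumptions chosen only for illustration. The sketch reuses each stored instance's relative effect at nearby states ("similar effects"), averages values over nearby sampled states ("similar values"), and then runs ordinary value iteration on the resulting finite model.

```python
import math

GAMMA = 0.9          # discount factor (assumed for this toy example)
ACTIONS = (-1, +1)   # move left / move right on the unit interval


def clamp(x):
    """Keep a state inside the interval [0, 1]."""
    return min(1.0, max(0.0, x))


def averager(query, points, bandwidth=0.05):
    """Gaussian-kernel averager: nonnegative weights over `points` summing to 1."""
    raw = [math.exp(-((query - p) / bandwidth) ** 2) for p in points]
    total = sum(raw)
    return [w / total for w in raw]


def collect_instances():
    """Hypothetical experience: one (s, a, r, s_next, done) tuple per grid state and action."""
    data = []
    for i in range(10):
        s = i / 10
        for a in ACTIONS:
            s_next = clamp(s + 0.1 * a)
            done = s_next >= 1.0 - 1e-9   # reaching the right edge ends the episode
            data.append((s, a, 1.0 if done else 0.0, s_next, done))
    return data


def build_finite_model(data):
    """Derive a finite MDP over the sampled states from the stored instances."""
    states = sorted({s for s, *_ in data})
    P, R = {}, {}
    for i, x in enumerate(states):
        for a in ACTIONS:
            inst = [d for d in data if d[1] == a]
            w = averager(x, [d[0] for d in inst])
            probs = [0.0] * len(states)
            reward = 0.0
            for w_j, (s, _, r, s_next, done) in zip(w, inst):
                reward += w_j * r
                if done:
                    continue  # terminal instances contribute no future value
                # "Similar effects": apply the neighbor's relative effect at x.
                predicted = clamp(x + (s_next - s))
                # "Similar values": spread the predicted state over sampled states.
                for k, phi in enumerate(averager(predicted, states)):
                    probs[k] += w_j * phi
            P[i, a], R[i, a] = probs, reward
    return states, P, R


def q_value(i, a, V, P, R):
    """One-step backup for state index i and action a in the finite model."""
    return R[i, a] + GAMMA * sum(p * v for p, v in zip(P[i, a], V))


def value_iteration(states, P, R, sweeps=200):
    """Exact value iteration on the finite model."""
    V = [0.0] * len(states)
    for _ in range(sweeps):
        V = [max(q_value(i, a, V, P, R) for a in ACTIONS)
             for i in range(len(states))]
    return V


states, P, R = build_finite_model(collect_instances())
V = value_iteration(states, P, R)
policy = [max(ACTIONS, key=lambda a: q_value(i, a, V, P, R))
          for i in range(len(states))]
```

In this sketch the rows of P may sum to less than one, because the weight assigned to terminal instances leads to no successor state. A narrower bandwidth makes the finite model track the stored instances more closely, while a wider one generalizes more aggressively across states.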

BibTeX Entry

@InProceedings{AAMAS07-jong,
       author="Nicholas K. Jong and Peter Stone",
       title="Model-Based Function Approximation for Reinforcement Learning",
       booktitle="The Sixth International Joint Conference on Autonomous Agents and Multiagent Systems",
       month="May",
       year="2007",
       abstract={
                 Reinforcement learning promises a generic method for
                 adapting agents to arbitrary tasks in arbitrary
                 stochastic environments, but applying it to new
                 real-world problems remains difficult, a few
                 impressive success stories notwithstanding.  Most
                 interesting agent-environment systems have large
                 state spaces, so performance depends crucially on
                 efficient generalization from a small amount of
                 experience.  Current algorithms rely on model-free
                 function approximation, which estimates the long-term
                 values of states and actions directly from data and
                 assumes that actions have similar values in similar
                 states. This paper proposes model-based function
                 approximation, which combines two forms of
                 generalization by assuming that in addition to having
                 similar values in similar states, actions also have
                 similar effects.  For one family of generalization
                 schemes known as averagers, computation of an
                 approximate value function from an approximate model
                 is shown to be equivalent to the computation of the
                 exact value function for a finite model derived from
                 data.  This derivation both integrates two
                 independent sources of generalization and permits the
                 extension of model-based techniques developed for
                 finite problems.  Preliminary experiments with a
                 novel algorithm, AMBI (Approximate Models Based on
                 Instances), demonstrate that this approach yields
                 faster learning on some standard benchmark problems
                 than many contemporary algorithms.
       },
}
