Peter Stone's Selected Publications

Intrinsically motivated model learning for developing curious robots.
Todd Hester and Peter Stone.
Artificial Intelligence, 247:170–186, June 2017.
Available from the journal website: http://www.sciencedirect.com/science/article/pii/S0004370215000764

Abstract

Reinforcement Learning (RL) agents are typically deployed to learn a specific, concrete task based on a pre-defined reward function. However, in some cases an agent may be able to gain experience in the domain prior to being given a task. In such cases, intrinsic motivation can be used to enable the agent to learn a useful model of the environment that is likely to help it learn its eventual tasks more efficiently. This paradigm fits robots particularly well, as they need to learn about their own dynamics and affordances, which can be applied to many different tasks. This article presents the TEXPLORE with Variance-And-Novelty-Intrinsic-Rewards algorithm (TEXPLORE-VANIR), an intrinsically motivated model-based RL algorithm. The algorithm learns models of the transition dynamics of a domain using random forests. It calculates two different intrinsic motivations from this model: one to explore where the model is uncertain, and one to acquire novel experiences that the model has not yet been trained on. This article presents experiments demonstrating that the combination of these two intrinsic rewards enables the algorithm to learn an accurate model of a domain with no external rewards and that the learned model can be used afterward to perform tasks in the domain. While learning the model, the agent explores the domain in a developing and curious way, progressively learning more complex skills. In addition, the experiments show that combining the agent's intrinsic rewards with external task rewards enables the agent to learn faster than using external rewards alone. We also present results demonstrating the applicability of this approach to learning on robots.
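
The two intrinsic rewards described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that model uncertainty is measured as the variance of the per-tree predictions of an ensemble, and that novelty is measured as the L1 distance from a query state-action to the nearest example the model was trained on. The function names and the weights `v_coef` and `n_coef` are hypothetical, introduced only for this example.

```python
from statistics import pvariance
from typing import Sequence


def variance_reward(tree_predictions: Sequence[float]) -> float:
    """Uncertainty bonus (assumed form): spread of the ensemble's
    predictions for a single state-action pair."""
    return pvariance(tree_predictions)


def novelty_reward(query: Sequence[float],
                   seen: Sequence[Sequence[float]]) -> float:
    """Novelty bonus (assumed form): L1 distance from `query` to the
    closest state-action vector in the training set `seen`."""
    if not seen:
        raise ValueError("novelty needs at least one training example")
    return min(
        sum(abs(q - s) for q, s in zip(query, point))
        for point in seen
    )


def intrinsic_reward(tree_predictions: Sequence[float],
                     query: Sequence[float],
                     seen: Sequence[Sequence[float]],
                     v_coef: float = 1.0,
                     n_coef: float = 1.0) -> float:
    """Weighted sum of the two bonuses; the weights are hypothetical."""
    return (v_coef * variance_reward(tree_predictions)
            + n_coef * novelty_reward(query, seen))


if __name__ == "__main__":
    # Two trees disagree about the outcome, and the query lies away
    # from both training points, so both bonuses are positive.
    print(intrinsic_reward([0.0, 2.0], [0, 0], [[1, 1], [0, 3]]))
```

Under these assumptions, an agent with no external reward would act to maximize this quantity, which draws it toward state-actions where the trees disagree or that lie far from its past experience.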

BibTeX Entry

@article{AIJ15-Hester,
AUTHOR={Todd Hester and Peter Stone},     
TITLE={Intrinsically motivated model learning for developing curious robots},
JOURNAL={Artificial Intelligence},
YEAR={2017},
month={June},
pages={170--186},
volume={247},
URL={http://www.sciencedirect.com/science/article/pii/S0004370215000764},
DOI={10.1016/j.artint.2015.05.002},
ISSN={0004-3702},
ABSTRACT={
          Reinforcement Learning (RL) agents are typically deployed to
          learn a specific, concrete task based on a pre-defined
          reward function. However, in some cases an agent may be able
          to gain experience in the domain prior to being given a
          task. In such cases, intrinsic motivation can be used to
          enable the agent to learn a useful model of the environment
          that is likely to help it learn its eventual tasks more
          efficiently. This paradigm fits robots particularly well, as
          they need to learn about their own dynamics and affordances,
          which can be applied to many different tasks. This article
          presents the TEXPLORE with
          Variance-And-Novelty-Intrinsic-Rewards algorithm
          (TEXPLORE-VANIR), an intrinsically motivated model-based RL
          algorithm. The algorithm learns models of the transition
          dynamics of a domain using random forests. It calculates two
          different intrinsic motivations from this model: one to
          explore where the model is uncertain, and one to acquire
          novel experiences that the model has not yet been trained
          on. This article presents experiments demonstrating that the
          combination of these two intrinsic rewards enables the
          algorithm to learn an accurate model of a domain with no
          external rewards and that the learned model can be used
          afterward to perform tasks in the domain. While learning the
          model, the agent explores the domain in a developing and
          curious way, progressively learning more complex skills. In
          addition, the experiments show that combining the agent's
          intrinsic rewards with external task rewards enables the
          agent to learn faster than using external rewards alone. We
          also present results demonstrating the applicability of this
          approach to learning on robots.
},
  wwwnote={<a href="http://www.sciencedirect.com/science/article/pii/S0004370215000764">from journal website.</a>}
}
