Peter Stone's Selected Publications


Learning Curriculum Policies for Reinforcement Learning

Sanmit Narvekar and Peter Stone. Learning Curriculum Policies for Reinforcement Learning. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2019.

Abstract

Curriculum learning in reinforcement learning is a training methodology that seeks to speed up learning of a difficult target task, by first training on a series of simpler tasks and transferring the knowledge acquired to the target task. Automatically choosing a sequence of such tasks (i.e., a curriculum) is an open problem that has been the subject of much recent work in this area. In this paper, we build upon a recent method for curriculum design, which formulates the curriculum sequencing problem as a Markov Decision Process. We extend this model to handle multiple transfer learning algorithms, and show for the first time that a curriculum policy over this MDP can be learned from experience. We explore various representations that make this possible, and evaluate our approach by learning curriculum policies for multiple agents in two different domains. The results show that our method produces curricula that can train agents to perform on a target task as fast or faster than existing methods.
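To make the idea of a curriculum policy learned over a curriculum MDP concrete, here is a minimal toy sketch in Python. It is not the authors' method or code. The state (a single integer skill level), the actions (training on a task of a given difficulty), the reward (negative training cost), the cost model in `train`, and the use of tabular Q-learning are all assumptions chosen for illustration, as are the names `TASKS`, `TARGET`, `learn_curriculum_policy`, and `greedy_curriculum`.

```python
import random

# Hypothetical toy setup: tasks are identified by a difficulty level, and the
# hardest one is the target task. None of these values come from the paper.
TASKS = [1, 2, 3]
TARGET = 3


def train(skill, task):
    """Toy learner model (an assumption): return (new_skill, training_cost)."""
    gap = task - skill
    if gap <= 0:
        return skill, 1            # nothing new to learn; a small wasted cost
    return task, 10 * gap * gap    # bigger jumps in difficulty cost much more


def learn_curriculum_policy(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning over the toy curriculum MDP.

    State: the learner's current skill. Action: the next task to train on.
    Reward: the negative training cost. An episode ends at the target skill.
    """
    rng = random.Random(seed)
    q = {(s, t): 0.0 for s in range(TARGET) for t in TASKS}
    for _ in range(episodes):
        skill = 0
        while skill != TARGET:
            if rng.random() < epsilon:
                task = rng.choice(TASKS)                        # explore
            else:
                task = max(TASKS, key=lambda t: q[(skill, t)])  # exploit
            new_skill, cost = train(skill, task)
            if new_skill == TARGET:
                future = 0.0
            else:
                future = max(q[(new_skill, t)] for t in TASKS)
            # Undiscounted Q-learning update with reward = -cost.
            q[(skill, task)] += alpha * (-cost + future - q[(skill, task)])
            skill = new_skill
    return q


def greedy_curriculum(q, max_len=10):
    """Follow the learned policy greedily to read off a task sequence."""
    skill, curriculum, total_cost = 0, [], 0
    while skill != TARGET and len(curriculum) < max_len:
        task = max(TASKS, key=lambda t: q[(skill, t)])
        skill, cost = train(skill, task)
        curriculum.append(task)
        total_cost += cost
    return curriculum, total_cost


q = learn_curriculum_policy()
curriculum, cost = greedy_curriculum(q)
print("learned curriculum:", curriculum, "total cost:", cost)
print("cost of training on the target directly:", train(0, TARGET)[1])
```

In this toy model the learned policy sequences the easier tasks before the target because that path has the lowest total training cost. The paper's actual state representations, transfer learning algorithms, and domains are not reproduced here.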

BibTeX Entry

@InProceedings{AAMAS19-Narvekar,
  author = {Sanmit Narvekar and Peter Stone},
  title = {Learning Curriculum Policies for Reinforcement Learning},
  booktitle = {Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},
  location = {Montreal, Canada},
  month = {May},
  year = {2019},
  abstract = {
    Curriculum learning in reinforcement learning is a training methodology that
    seeks to speed up learning of a difficult target task, by first training on a
    series of simpler tasks and transferring the knowledge acquired to the target
    task. Automatically choosing a sequence of such tasks (i.e., a curriculum) is
    an open problem that has been the subject of much recent work in this area.
    In this paper, we build upon a recent method for curriculum design, which
    formulates the curriculum sequencing problem as a Markov Decision Process. We
    extend this model to handle multiple transfer learning algorithms, and show
    for the first time that a curriculum policy over this MDP can be learned from
    experience. We explore various representations that make this possible, and
    evaluate our approach by learning curriculum policies for multiple agents
    in two different domains. The results show that our method produces
    curricula that can train agents to perform on a target task as fast or
    faster than existing methods.
  },
}

Generated by bib2html.pl (written by Patrick Riley) on Fri Jun 14, 2019 11:36:26