Peter Stone's Selected Publications



Generalized Model Learning for Reinforcement Learning on a Humanoid Robot

Todd Hester, Michael Quinlan, and Peter Stone. Generalized Model Learning for Reinforcement Learning on a Humanoid Robot. In IEEE International Conference on Robotics and Automation (ICRA), May 2010.
Video available at http://www.cs.utexas.edu/~AustinVilla/?p=research/rl_kick


Abstract

Reinforcement learning (RL) algorithms have long been promising methods for enabling an autonomous robot to improve its behavior on sequential decision-making tasks. The obvious enticement is that the robot should be able to improve its own behavior without the need for detailed step-by-step programming. However, for RL to reach its full potential, the algorithms must be sample efficient: they must learn competent behavior from very few real-world trials. From this perspective, model-based methods, which use experiential data more efficiently than model-free approaches, are appealing. But they often require exhaustive exploration to learn an accurate model of the domain. In this paper, we present an algorithm, Reinforcement Learning with Decision Trees (RL-DT), that uses decision trees to learn the model by generalizing the relative effect of actions across states. The agent explores the environment until it believes it has a reasonable policy. The combination of the learning approach with the targeted exploration policy enables fast learning of the model. We compare RL-DT against standard model-free and model-based learning methods, and demonstrate its effectiveness on an Aldebaran Nao humanoid robot scoring goals in a penalty kick scenario.
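The idea of learning a model by generalizing the relative effect of actions across states can be illustrated with a minimal sketch. This is not the RL-DT implementation from the paper: the one-dimensional corridor, the experience tuples, the helper names (`build_tree`, `predict`, `predict_next_x`), and the tiny hand-rolled tree with equality splits are all assumptions made for illustration. The tree is trained on the change in position (`next_x - x`) rather than on the absolute next position, so a handful of samples is enough to predict the outcome of an action in a state that was never visited.

```python
from collections import Counter

def majority(labels):
    """Most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]

def errors(labels):
    """Number of labels that disagree with the majority label."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def build_tree(rows, labels):
    """Grow a small tree over discrete features using equality tests.

    rows   -- list of feature tuples, here (x, action)
    labels -- list of relative effects, here next_x - x
    """
    if len(set(labels)) == 1:
        return ("leaf", labels[0])
    best = None
    for f in range(len(rows[0])):
        for v in sorted({r[f] for r in rows}):
            yes = [l for r, l in zip(rows, labels) if r[f] == v]
            no = [l for r, l in zip(rows, labels) if r[f] != v]
            if not yes or not no:
                continue  # split does not separate anything
            err = errors(yes) + errors(no)
            if best is None or err < best[0]:
                best = (err, f, v)
    if best is None:
        return ("leaf", majority(labels))
    _, f, v = best
    yes_idx = [i for i, r in enumerate(rows) if r[f] == v]
    no_idx = [i for i, r in enumerate(rows) if r[f] != v]
    return (
        "node", f, v,
        build_tree([rows[i] for i in yes_idx], [labels[i] for i in yes_idx]),
        build_tree([rows[i] for i in no_idx], [labels[i] for i in no_idx]),
    )

def predict(tree, row):
    """Follow the equality tests down to a leaf and return its label."""
    while tree[0] == "node":
        _, f, v, yes, no = tree
        tree = yes if row[f] == v else no
    return tree[1]

def predict_next_x(tree, x, action):
    """Predicted next position = current position + predicted relative effect."""
    return x + predict(tree, (x, action))

# Hypothetical experience in a corridor with cells 0..4 and walls at both ends:
# (x, action, next_x). Position 2 is never visited.
experience = [
    (0, "right", 1), (1, "right", 2), (4, "right", 4),
    (0, "left", 0), (3, "left", 2), (4, "left", 3),
]
rows = [(x, a) for x, a, _ in experience]
labels = [nx - x for x, _, nx in experience]  # relative effects, not absolute states
tree = build_tree(rows, labels)

print(predict_next_x(tree, 2, "right"))
print(predict_next_x(tree, 2, "left"))
```

Because the targets are relative changes, the six samples collapse into three outcomes (`+1`, `-1`, `0`), and the tree only needs to separate the wall cells from the rest. A model trained on absolute next positions would have no basis for predicting anything at the unvisited cell.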

BibTeX Entry

@InProceedings{ICRA10-hester,
  author = "Todd Hester and Michael Quinlan and Peter Stone",
  title = "Generalized Model Learning for Reinforcement Learning on a Humanoid Robot",
  booktitle = "{IEEE} International Conference on Robotics and Automation (ICRA)",
  location = "Anchorage, Alaska",
  month = "May",
  year = "2010",
  abstract = "Reinforcement learning (RL) algorithms have long been
	promising methods for enabling an autonomous robot to improve its
	behavior on sequential decision-making tasks. The obvious enticement is
	that the robot should be able to improve its own behavior without the
	need for detailed step-by-step programming. However, for RL to reach its
	full potential, the algorithms must be sample efficient: they must learn
	competent behavior from very few real-world trials. From this
	perspective, model-based methods, which use experiential data more
	efficiently than model-free approaches, are appealing. But they often
	require exhaustive exploration to learn an accurate model of the domain.
	In this paper, we present an algorithm, Reinforcement Learning with
	Decision Trees (RL-DT), that uses decision trees to learn the model by
	generalizing the relative effect of actions across states. The agent
	explores the environment until it believes it has a reasonable policy.
	The combination of the learning approach with the targeted exploration
	policy enables fast learning of the model. We compare RL-DT against
	standard model-free and model-based learning methods, and demonstrate
	its effectiveness on an Aldebaran Nao humanoid robot scoring goals in a
	penalty kick scenario.",
  wwwnote={Video available at <a href="http://www.cs.utexas.edu/~AustinVilla/?p=research/rl_kick">http://www.cs.utexas.edu/~AustinVilla/?p=research/rl_kick</a>},
}
