Peter Stone's Selected Publications

Design Principles for Creating Human-Shapable Agents

Design Principles for Creating Human-Shapable Agents.
W. Bradley Knox, Ian Fasel, and Peter Stone.
In AAAI Spring 2009 Symposium on Agents that Learn from Human Teachers, March 2009.
Abstract

In order for learning agents to be useful to non-technical users, it is important to be able to teach agents how to perform new tasks using simple communication methods. We begin this paper by describing a framework we recently developed called Training an Agent Manually via Evaluative Reinforcement (TAMER), which allows a human to train a learning agent by giving simple scalar reinforcement signals while observing the agent perform the task. (In this paper, we distinguish between human reinforcement and environmental reward within an MDP. To avoid confusion, human feedback is always called "reinforcement".) We then discuss how this work fits into a general taxonomy of methods for human-teachable (HT) agents and argue that the entire field of HT agents could benefit from an increased focus on the human side of teaching interactions. We then propose a set of conjectures about aspects of human teaching behavior that we believe could be incorporated into future work on HT agents.

BibTeX Entry

@InProceedings{AAAIsymp09-knox,
 author="W.\ Bradley Knox and Ian Fasel and Peter Stone",
 title="Design Principles for Creating Human-Shapable Agents",
 booktitle="AAAI Spring 2009 Symposium on Agents that Learn from Human Teachers",
 month="March",
 year="2009",
 abstract={In order for learning agents to be useful to non-technical users, it
  is important to be able to teach agents how to perform new tasks using
  simple communication methods. We begin this paper by describing a
  framework we recently developed called Training an Agent Manually via
  Evaluative Reinforcement (TAMER), which allows a human to train a
  learning agent by giving simple scalar reinforcement\footnote{In this
  paper, we distinguish between human reinforcement and environmental
  reward within an MDP. To avoid confusion, human feedback is always
  called ``reinforcement''.} signals while observing the agent perform
  the task. We then discuss how this work fits into a general taxonomy
  of methods for human-teachable (HT) agents and argue that the entire
  field of HT agents could benefit from an increased focus on the {\em
  human} side of teaching interactions.  We then propose a set of
  conjectures about aspects of human teaching behavior that we believe
  could be incorporated into future work on HT agents.},
 wwwnote={<a href="http://www.aaai.org/Symposia/Spring/sss09.php">AAAI Spring 2009 Symposium: Agents that Learn from Human Teachers</a>},
}
