Peter Stone's Selected Publications



Interactively Shaping Agents via Human Reinforcement: The TAMER Framework

W. Bradley Knox and Peter Stone. Interactively Shaping Agents via Human Reinforcement: The TAMER Framework. In The Fifth International Conference on Knowledge Capture, September 2009.
The TAMER project page with videos of TAMER in action.
K-CAP 2009

Download

[PDF] 540.2 kB  [postscript] 3.7 MB

Abstract

As computational learning agents move into domains that incur real costs (e.g., autonomous driving or financial investment), it will be necessary to learn good policies without numerous high-cost learning trials. One promising approach to reducing sample complexity of learning a task is knowledge transfer from humans to agents. Ideally, methods of transfer should be accessible to anyone with task knowledge, regardless of that person's expertise in programming and AI. This paper focuses on allowing a human trainer to interactively shape an agent's policy via reinforcement signals. Specifically, the paper introduces "Training an Agent Manually via Evaluative Reinforcement," or TAMER, a framework that enables such shaping. Differing from previous approaches to interactive shaping, a TAMER agent models the human's reinforcement and exploits its model by choosing actions expected to be most highly reinforced. Results from two domains demonstrate that lay users can train TAMER agents without defining an environmental reward function (as in an MDP) and indicate that human training within the TAMER framework can reduce sample complexity over autonomous learning algorithms.
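
The core idea in the abstract (model the human's reinforcement, then choose the action the model predicts will be most highly reinforced) can be illustrated with a toy sketch. This is not the authors' implementation: the tabular model, the step size, the corridor task, and the `simulated_trainer` function standing in for a human are all assumptions made for illustration, and the sketch ignores practical issues such as delayed human feedback.

```python
import random


class TamerSketch:
    """Toy agent: learns a model of human reinforcement and acts greedily on it."""

    def __init__(self, n_states, actions, step_size=0.2):
        self.actions = list(actions)
        self.step_size = step_size  # assumed learning rate, not from the paper
        # Tabular model h[state][action]: predicted human reinforcement.
        self.h = [{a: 0.0 for a in self.actions} for _ in range(n_states)]

    def choose_action(self, state):
        # Exploit the model: pick the action predicted to be most highly
        # reinforced, breaking ties at random.
        preds = self.h[state]
        best = max(preds.values())
        return random.choice([a for a in self.actions if preds[a] == best])

    def update(self, state, action, human_signal):
        # Move the prediction toward the reinforcement the trainer gave.
        error = human_signal - self.h[state][action]
        self.h[state][action] += self.step_size * error


def simulated_trainer(state, action, goal):
    """Hypothetical stand-in for a human: +1 for moving toward the goal, else -1."""
    return 1.0 if (goal - state) * action > 0 else -1.0


def train(n_states=7, goal=6, episodes=20, max_steps=50, seed=0):
    """Train on a 1-D corridor; note that no environmental reward is defined."""
    random.seed(seed)
    agent = TamerSketch(n_states, actions=(-1, +1))
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if state == goal:
                break
            action = agent.choose_action(state)
            agent.update(state, action, simulated_trainer(state, action, goal))
            # Corridor dynamics: move one cell, clamped to the ends.
            state = min(max(state + action, 0), n_states - 1)
    return agent


if __name__ == "__main__":
    agent = train()
    print([agent.choose_action(s) for s in range(6)])
```

The only learning signal in this sketch is the trainer's evaluative feedback; the agent never sees a reward from the environment, which mirrors the abstract's point that trainers need not define an environmental reward function.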

BibTeX Entry

@InProceedings{KCAP09-knox,
 author="W.~Bradley Knox and Peter Stone",
 title="Interactively Shaping Agents via Human Reinforcement: The {TAMER} Framework",
 booktitle="The Fifth International Conference on Knowledge Capture",
 month="September",
 year="2009",
 abstract={As computational learning agents move into domains that incur real
costs (e.g., autonomous driving or financial investment), it will be
necessary to learn good policies without numerous high-cost learning
trials. One promising approach to reducing sample complexity of
learning a task is knowledge transfer from humans to agents. Ideally,
methods of transfer should be accessible to anyone with task
knowledge, regardless of that person's expertise in programming and
AI. This paper focuses on allowing a human trainer to interactively
shape an agent's policy via reinforcement signals. Specifically, the
paper introduces ``Training an Agent Manually via Evaluative
Reinforcement,'' or TAMER, a framework that enables such shaping.
Differing from previous approaches to interactive shaping, a TAMER
agent models the human's reinforcement and exploits its model by
choosing actions expected to be most highly reinforced. Results from
two domains demonstrate that lay users can train TAMER agents
without defining an environmental reward function (as in an MDP)
and indicate that human training within the TAMER framework
can reduce sample complexity over autonomous learning algorithms.
},
 wwwnote={The <a href="http://www.cs.utexas.edu/~bradknox/TAMER.html">TAMER</a> project page with <a href="http://www.cs.utexas.edu/~bradknox/TAMER_in_Action.html">videos</a> of TAMER in action.<br><a href="http://kcap09.stanford.edu/">K-CAP 2009</a>},
}
