Peter Stone's Selected Publications



RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration

Brahma S. Pavse, Faraz Torabi, Josiah Hanna, Garrett Warnell, and Peter Stone. RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration. In Imitation, Intent, and Interaction (I3) Workshop at ICML 2019, June 2019.
Video of the experiments.

Download

[PDF] 461.8 kB

Abstract

Imitation learning has long been an approach to alleviating the tractability issues that arise in reinforcement learning. However, most of the literature makes several assumptions, such as access to the expert's actions, the availability of many expert demonstrations, and the injection of task-specific domain knowledge into the learning process. We propose reinforced inverse dynamics modeling (RIDM), a method that combines reinforcement learning and imitation from observation (IfO) to perform imitation from a single expert demonstration, with no access to the expert's actions and with little task-specific domain knowledge. Given only a single sequence of the expert's raw states, such as joint angles in a robot control task, we learn an inverse dynamics model that produces, at each time step, the low-level actions, such as torques, needed to transition from one state to the next while maximizing the reward from the environment. We demonstrate that, under the same constraints, RIDM outperforms other techniques on six domains of the MuJoCo simulator and on two robot soccer tasks for two experts from the RoboCup 3D simulation league on the SimSpark simulator.
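
The following is a minimal, self-contained sketch of the pipeline the abstract describes: pre-train an inverse dynamics model on random exploration data, use it to track a state-only demonstration by predicting the action that moves the current state toward the next demonstrated state, and fine-tune the model's parameters against the environment's reward. The toy PointEnv environment, the linear least-squares inverse dynamics model, and the random-search fine-tuning stage are illustrative assumptions for readability, not the paper's actual tasks or optimizer.

import numpy as np

class PointEnv:
    """Toy 1-D point-mass environment standing in for a MuJoCo task."""
    def __init__(self):
        self.state = np.zeros(2)  # [position, velocity]

    def reset(self):
        self.state = np.zeros(2)
        return self.state.copy()

    def step(self, action):
        pos, vel = self.state
        vel = vel + 0.1 * float(np.clip(action, -1.0, 1.0))
        pos = pos + 0.1 * vel
        self.state = np.array([pos, vel])
        reward = -abs(pos - 1.0)  # environment reward: reach position 1.0
        return self.state.copy(), reward

def collect_exploration_data(env, steps=2000, rng=None):
    """Random-action rollouts yield (s, a, s') triples for IDM pre-training."""
    if rng is None:
        rng = np.random.default_rng(0)
    s = env.reset()
    X, y = [], []
    for _ in range(steps):
        a = rng.uniform(-1.0, 1.0)
        s_next, _ = env.step(a)
        X.append(np.concatenate([s, s_next]))
        y.append(a)
        s = s_next
    return np.array(X), np.array(y)

def fit_idm(X, y):
    """Least-squares inverse dynamics model: action = w . [s, s', 1]."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def rollout(env, w, demo_states):
    """Track the single demonstration: at each step, the IDM predicts the
    action that should carry the current state to the next demo state."""
    s = env.reset()
    total_reward = 0.0
    for d_next in demo_states[1:]:
        a = np.concatenate([s, d_next, [1.0]]) @ w
        s, r = env.step(a)
        total_reward += r
    return total_reward

# Pre-train the IDM from exploration data.
env = PointEnv()
X, y = collect_exploration_data(env)
w = fit_idm(X, y)

# A single observed, state-only demonstration (hand-crafted here).
demo_states = [np.array([0.1 * t, 1.0]) for t in range(11)]

# "Reinforced" stage: fine-tune the IDM parameters to maximize the
# environment's reward while tracking the demonstration (the paper uses a
# more sophisticated optimizer; random search keeps this sketch runnable).
rng = np.random.default_rng(1)
best_w, best_r = w, rollout(env, w, demo_states)
for _ in range(200):
    cand = best_w + 0.05 * rng.standard_normal(best_w.shape)
    r = rollout(env, cand, demo_states)
    if r > best_r:
        best_w, best_r = cand, r
print(f"episode reward after fine-tuning: {best_r:.3f}")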

BibTeX Entry

@InProceedings{ICML2019-pavse,
  author = {Brahma S. Pavse and Faraz Torabi and Josiah Hanna and Garrett Warnell and Peter Stone},
  title = {RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration},
  booktitle = {Imitation, Intent, and Interaction (I3) Workshop at ICML 2019},
  location = {Long Beach, California, USA},
  month = {June},
  year = {2019},
  abstract = {
Imitation learning has long been an approach to alleviating the tractability
issues that arise in reinforcement learning. However, most of the literature
makes several assumptions, such as access to the expert's actions, the
availability of many expert demonstrations, and the injection of
task-specific domain knowledge into the learning process. We propose
reinforced inverse dynamics modeling (RIDM), a method that combines
reinforcement learning and imitation from observation (IfO) to perform
imitation from a single expert demonstration, with no access to the expert's
actions and with little task-specific domain knowledge. Given only a single
sequence of the expert's raw states, such as joint angles in a robot control
task, we learn an inverse dynamics model that produces, at each time step,
the low-level actions, such as torques, needed to transition from one state
to the next while maximizing the reward from the environment. We demonstrate
that, under the same constraints, RIDM outperforms other techniques on six
domains of the MuJoCo simulator and on two robot soccer tasks for two
experts from the RoboCup 3D simulation league on the SimSpark simulator.
  },
  wwwnote={<a href="https://sites.google.com/view/ridm-reinforced-inverse-dynami">Video of the experiments</a>.},
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Jul 15, 2020 21:34:11