PixL2R: Guiding Reinforcement Learning using Natural Language by Mapping Pixels to Rewards (2020)

Prasoon Goyal, Scott Niekum, Raymond J. Mooney

Reinforcement learning (RL), particularly in sparse reward settings, often requires prohibitively large numbers of interactions with the environment, thereby limiting its applicability to complex problems. To address this, several prior approaches have used natural language to guide the agent's exploration. However, these approaches typically operate on structured representations of the environment, and/or assume some structure in the natural language commands. In this work, we propose a model that directly maps pixels to rewards, given a free-form natural language description of the task, which can then be used for policy training. Our experiments on the Meta-World robot manipulation domain show that language-based rewards significantly improve learning. Further, we analyze the resulting framework using multiple ablation experiments to better understand the nature of these improvements.
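As a rough illustration of how a language-based reward could feed into policy training, the sketch below adds a weighted, language-conditioned score to a sparse environment reward. Everything in it is an assumption made for illustration: the names `toy_reward_model` and `combined_reward`, the `weight` parameter, and the additive form are hypothetical, and the toy scorer is a hand-written stand-in, not the learned pixels-to-reward model the abstract describes.

```python
from typing import Callable, Sequence

# Hypothetical interface: (frames, description) -> scalar score.
Frames = Sequence[Sequence[float]]
RewardModel = Callable[[Frames, str], float]


def toy_reward_model(frames: Frames, description: str) -> float:
    """Toy stand-in for a learned pixels-plus-language reward model.

    Each frame is a flat list of pixel intensities in [0, 1]. The score is
    high when the brightness of the last frame matches what the description
    asks for ("on" means bright, anything else means dark).
    """
    last_frame = frames[-1]
    brightness = sum(last_frame) / len(last_frame)
    target = 1.0 if "on" in description.lower().split() else 0.0
    return 1.0 - abs(brightness - target)


def combined_reward(
    env_reward: float,
    frames: Frames,
    description: str,
    model: RewardModel,
    weight: float = 0.1,
) -> float:
    """Add a weighted language-based term to the (sparse) environment reward."""
    return env_reward + weight * model(frames, description)


if __name__ == "__main__":
    frames = [[0.0, 0.0], [1.0, 1.0]]  # two tiny "images": dark, then bright
    print(combined_reward(0.0, frames, "turn the light on", toy_reward_model, weight=0.5))
    print(combined_reward(0.0, frames, "turn the light off", toy_reward_model, weight=0.5))
```

With a sparse environment reward of zero on most steps, the language term is the only learning signal until the task is completed, which is the intuition behind using it to guide exploration.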

Citation:

In 4th Conference on Robot Learning (CoRL), November 2020. Also presented at the 1st Language in Reinforcement Learning (LaReL) Workshop at ICML, July 2020 (Best Paper Award), and the 6th Deep Reinforcement Learning Workshop at Neural Information Processing Systems (NeurIPS), December 2020.

People

Prasoon Goyal	Ph.D. Alumni	pgoyal [at] cs utexas edu
Raymond J. Mooney	Faculty	mooney [at] cs utexas edu
Scott Niekum	Faculty	sniekum [at] cs utexas edu

Areas of Interest

Language and Robotics
Reinforcement Learning

Labs

Machine Learning
The Personal Autonomous Robotics Lab (PeARL)