Zero-shot Task Adaptation using Natural Language (2021)

Prasoon Goyal, Raymond J. Mooney, Scott Niekum

Imitation learning and instruction-following are two common approaches to communicating a user’s intent to a learning agent. However, as the complexity of tasks grows, it may be beneficial to use both demonstrations and language to communicate with an agent. In this work, we propose a novel setting where, given a demonstration for a task (the source task) and a natural language description of the differences between the demonstrated task and a related but different task (the target task), our goal is to train an agent to complete the target task in a zero-shot setting, that is, without any demonstrations for the target task. To this end, we introduce Language-Aided Reward and Value Adaptation (LARVA), which, given a source demonstration and a linguistic description of how the target task differs, learns to output either a reward or value function that accurately reflects the target task. Our experiments show that on a diverse set of adaptations, our approach is able to complete more than 95% of target tasks when using template-based descriptions, and more than 70% when using free-form natural language.

Citation:

Arxiv (2021).

People

Prasoon Goyal	Ph.D. Alumni	pgoyal [at] cs utexas edu
Raymond J. Mooney	Faculty	mooney [at] cs utexas edu

Areas of Interest

Language and Robotics

Labs

Machine Learning