Learning to Interpret Natural Language
Navigation Instructions from Observations

University of Texas at Austin
Department of Computer Sciences
David L. Chen and Raymond J. Mooney


Map and sample route of one of the virtual worlds

Description of the project

The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. In this project, we build a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns solely by observing how humans follow navigation instructions.

The system is trained and evaluated on the instructor and follower data collected by MacMahon et al. (2006). There are three virtual indoor environments in total. Each environment consists of interconnecting hallways with objects placed at various intersections. The hallways have several different floor patterns and wall paintings, which instructors used in conjunction with the objects when giving directions.

This project is part of our larger effort to develop learning techniques for grounded language acquisition. Compared to our earlier project on Learning to Sportscast, this project poses a more complex ambiguous supervision problem. Instead of considering only a handful of possible events referred to by a sportscasting comment, we have to consider an exponential number of navigation plans for each instruction. The interactive nature of the navigation task also allows for more interesting learning scenarios in which a human participant is involved.
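
To make the contrast with sportscasting concrete, the sketch below counts candidate plans under a simplifying assumption that is ours and not a description of the actual system: every non-empty subset of the actions and landmarks observed along a route is treated as a possible plan for the instruction. The function name and the component names are hypothetical.

```python
from itertools import combinations


def candidate_plans(components):
    """Return every non-empty, order-preserving subset of the observed components.

    Assumption (for illustration only): any such subset could be the plan
    that the instruction actually describes.
    """
    plans = []
    for size in range(1, len(components) + 1):
        plans.extend(combinations(components, size))
    return plans


# Hypothetical components observed while a follower executes one instruction.
observed = ["turn_left", "verify_chair", "travel_2", "verify_blue_hall"]
plans = candidate_plans(observed)
print(len(plans))  # 2**4 - 1 = 15
```

Under this assumption, n observed components yield 2^n - 1 candidates, so a route with 20 components already has more than a million, whereas a sportscasting comment has only a handful of candidate events.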

Demo

Below is an example of a successful run by our system trained on refined landmarks plans. In addition to the simulation, the parse for each instruction is shown. Notice that even though the system does not parse everything correctly, it captures enough of the meaning to form a sufficient plan.

Example run of our navigation system

Publication and Talks

  • Fast Online Lexicon Learning for Grounded Language Acquisition
    [Abstract] [PDF] [Slides (PPT)]
    David L. Chen
    Annual Meeting of the Association for Computational Linguistics (ACL), 2012

  • Learning to Interpret Natural Language Navigation Instructions from Observations
    [Abstract] [PDF] [Slides (PPT)]
    David L. Chen and Raymond J. Mooney
    AAAI Conference on Artificial Intelligence (AAAI), 2011

  • Panning for Gold: Finding Relevant Semantic Content for Grounded Language Learning
    [Abstract] [PDF] [Poster (PPT)] [Poster (PDF)]
    David L. Chen and Raymond J. Mooney
    Symposium on Machine Learning in Speech and Language Processing (MLSLP), 2011

Data and Code

Overview

The MARCO code and data used in all our experiments were originally produced by Matt MacMahon as described in his AAAI 2006 paper.

Three environments are used (named grid, l, and jelly), with instructions collected from six different subjects. The included map files describe the layout of each environment and the locations of its objects.

The included MARCO code is a modified version of Matt's original code, changed to make the MARCO parser and executor easier to use.

Citations

Please use the following citation when referencing the original MARCO code and data:

@InProceedings{macmahon:aaai06,
  title = "Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions",
  author = "Matt MacMahon and Brian Stankiewicz and Benjamin Kuipers",
  booktitle = "Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-2006)",
  address = "Boston, MA, USA",
  month = "July",
  year = 2006
} 

Please use the following citation when referencing our modified version of the MARCO code and data:

@InProceedings{chen:aaai11,
  title = "Learning to Interpret Natural Language Navigation Instructions from Observations",
  author = "David L. Chen and Raymond J. Mooney",
  booktitle = "Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011)",
  address = "San Francisco, CA, USA",
  month = "August",
  year = 2011
} 

The Mandarin Chinese translation of the data was first mentioned in the following paper:

@InProceedings{chen:acl12,
  title = "Fast Online Lexicon Learning for Grounded Language Acquisition",
  author = "David L. Chen",
  booktitle = "Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL-2012)",
  address = "Jeju, Republic of Korea",
  month = "July",
  year = 2012
} 

Downloads

Compressed tarball of data and code: LearningNavigationInstructions.tgz

You can also browse the data and code here

Contact Information

If you have any questions or comments, please contact David Chen.

If you are interested in reading more literature in this area, check out our reading group, CLAMP.