Connecting Language and Perception
To truly understand language, an intelligent system must be able to connect words, phrases, and sentences to its perception of objects and events in the world. Ideally, an AI system would learn language the way a human child does: by being exposed to utterances in a rich perceptual environment. The perceptual context provides the necessary supervisory information, and learning the connection between language and perception grounds the system's semantic representations in its perception of the world. As a step in this direction, we are developing systems that learn semantic parsers and language generators from sentences paired only with their perceptual context. This work is part of our broader research on natural language learning and is supported by the National Science Foundation through grants IIS-0712097 and IIS-1016312.
  • Grounded Language Learning [Video Lecture], Raymond J. Mooney, Invited Talk, AAAI, 2013.
  • Learning Language from its Perceptual Context [Video Lecture], Raymond J. Mooney, Invited Talk, ECML-PKDD, 2008.
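The ambiguous-supervision setting described above can be made concrete with a small sketch: each training sentence is paired with a set of candidate meanings drawn from its perceptual context, and the learner must infer which meaning the sentence actually describes. The Python sketch below is a hypothetical, minimal illustration of this idea, using an EM-style alternation between aligning sentences to candidate meanings and re-estimating word/meaning-symbol co-occurrence scores. It is not the actual system from any of the publications listed here, and all function names and toy data are invented for illustration.

```python
from collections import defaultdict

def train(data, iterations=10):
    """Learn word/meaning-symbol co-occurrence scores from ambiguous data.

    data: list of (words, candidate_meanings) pairs, where each candidate
    meaning is a tuple of symbols, e.g. ("pass", "purple3"). Which candidate
    a sentence describes is NOT given; it must be inferred.
    """
    score = defaultdict(lambda: 1e-3)  # small prior for unseen pairs
    for _ in range(iterations):
        counts = defaultdict(float)
        for words, candidates in data:
            # E-step: weight each candidate meaning by how strongly its
            # symbols co-occur with the sentence's words under the model.
            weights = [sum(score[(w, s)] for w in words for s in meaning)
                       for meaning in candidates]
            total = sum(weights) or 1.0
            # Accumulate fractional counts for the M-step.
            for meaning, wt in zip(candidates, weights):
                for w in words:
                    for s in meaning:
                        counts[(w, s)] += wt / total
        # M-step: renormalize counts per word into updated scores.
        word_totals = defaultdict(float)
        for (w, s), c in counts.items():
            word_totals[w] += c
        for (w, s), c in counts.items():
            score[(w, s)] = c / word_totals[w]
    return score

def best_meaning(score, words, candidates):
    """Pick the candidate meaning that best matches the sentence."""
    return max(candidates,
               key=lambda m: sum(score[(w, s)] for w in words for s in m))

# Toy sportscaster-style data: commentary paired with ambiguous event sets.
data = [
    (["purple3", "passes", "the", "ball"],
     [("pass", "purple3"), ("kick", "pink7")]),
    (["pink7", "kicks"],
     [("kick", "pink7"), ("pass", "purple3")]),
    (["purple3", "passes"],
     [("pass", "purple3"), ("turnover", "pink7")]),
]
model = train(data)
print(best_meaning(model, ["purple3", "passes", "the", "ball"], data[0][1]))
# -> ('pass', 'purple3')
```

On this toy data, the loop learns to associate "passes" with the pass event because the competing candidate meaning differs across perceptual contexts; the same intuition, with far richer linguistic and perceptual representations, underlies learning semantic parsers from ambiguous supervision.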

People:
Aishwarya Padmakumar Ph.D. Alumni aish [at] cs utexas edu
Jesse Thomason Ph.D. Alumni thomason.jesse [at] gmail
Subhashini Venugopalan Ph.D. Alumni vsub [at] cs utexas edu
Jordan Voas Ph.D. Student jvoas [at] utexas edu
Harel Yedidsion Postdoctoral Fellow harel [at] cs utexas edu
Publications:
Augmenting Robotic Capabilities through Natural Language 2025
Albert Yu, Ph.D. Proposal.
Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation 2025
Albert Yu, Chengshu Li, Luca Macesanu, Arnav Balaji, Ruchira Ray, Raymond Mooney, Roberto Martín-Martín, Preprint (2025).
Reasoning about Actions with Large Multimodal Models 2025
Vanya Cohen, Ph.D. Proposal.
Temporally Streaming Audio-Visual Synchronization for Real-World Videos 2025
Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, and David Harwath, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025.
Measuring Sound Symbolism in Audio-visual Models 2024
Wei-Cheng Tseng, Yi-Jen Shih, David Harwath, Raymond Mooney, IEEE Spoken Language Technology Workshop (SLT), 2024.
Multimodal Contextualized Semantic Parsing from Speech 2024
Jordan Voas, Raymond Mooney, David Harwath, Association for Computational Linguistics (ACL), 2024.
Directly Optimizing Evaluation Metrics to Improve Text to Motion 2023
Yili Wang, Master's Thesis, Department of Computer Science, UT Austin.
What is the Best Automated Metric for Text to Motion Generation? 2023
Jordan Voas, Master's Thesis, Department of Computer Science, UT Austin.
Dialog as a Vehicle for Lifelong Learning 2020
Aishwarya Padmakumar, Raymond J. Mooney, In Position Paper Track at the SIGDIAL Special Session on Physically Situated Dialogue (RoboDial 2.0), July 2020.
Systematic Generalization on gSCAN with Language Conditioned Embedding 2020
Tong Gao, Qi Huang, and Raymond J. Mooney, In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020), December 2020.
Learning a Policy for Opportunistic Active Learning 2018
Aishwarya Padmakumar, Peter Stone, Raymond J. Mooney, In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP-18), Brussels, Belgium, November 2018.
Learning to Connect Language and Perception 2008
Raymond J. Mooney, In Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI), pp. 1598-1601, Chicago, IL, July 2008. Senior Member Paper.
Learning Language Semantics from Ambiguous Supervision 2007
Rohit J. Kate and Raymond J. Mooney, In Proceedings of the 22nd Conference on Artificial Intelligence (AAAI-07), pp. 895-900, Vancouver, Canada, July 2007.
Learning Language from Perceptual Context: A Challenge Problem for AI 2006
Raymond J. Mooney, In Proceedings of the 2006 AAAI Fellows Symposium, Boston, MA, July 2006.