David Harwath's research interests span automatic speech recognition, spoken language understanding, and multi-modal machine learning. His work aims to develop models of speech and language that are robust, flexible, and capable of learning on the fly from multiple input modalities. He holds a B.S. in Electrical Engineering from the University of Illinois at Urbana-Champaign, an S.M. in Computer Science from MIT, and a Ph.D. in Computer Science from MIT.
Automatic speech recognition, spoken language understanding, multi-modal and embodied machine learning (speech, environmental sound, vision)
David Harwath, Wei-Ning Hsu, and James Glass. 2020. Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech. International Conference on Learning Representations (ICLR).
Wei-Ning Hsu, David Harwath, and James Glass. 2019. Transfer Learning from Audio-Visual Grounding to Speech Recognition. Interspeech.
Emmanuel Azuh, David Harwath, and James Glass. 2019. Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio. Interspeech.
Dídac Surís, Adrià Recasens, David Bau, David Harwath, James Glass, and Antonio Torralba. 2019. Learning Words by Drawing Images. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
David Harwath, Adrià Recasens, Dídac Surís, Antonio Torralba, and James Glass. 2019. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input. International Journal of Computer Vision.