David Harwath, Wei-Ning Hsu, and James Glass. 2020. Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech. International Conference on Learning Representations (ICLR).
Wei-Ning Hsu, David Harwath, and James Glass. 2019. Transfer Learning from Audio-Visual Grounding to Speech Recognition. Interspeech.
Emmanuel Azuh, David Harwath, and James Glass. 2019. Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio. Interspeech.
Dídac Surís, Adrià Recasens, David Bau, David Harwath, James Glass, and Antonio Torralba. 2019. Learning Words by Drawing Images. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
David Harwath, Adrià Recasens, Dídac Surís, Antonio Torralba, and James Glass. 2019. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input. International Journal of Computer Vision.
Awards & Honors
2018 - George M. Sprowls Award for best doctoral thesis in computer science, MIT