CS 395T:
Grounded Natural Language Processing

How to read research articles

Background papers recommended by Matt Lease.
  1. S. Keshav. How to Read a Paper. U. Waterloo, February 17, 2016.
  2. Alan Smith. 1990. The Task of the Referee.

Research Papers

Papers to be read and presented by students. Papers for "pair" presentations by two students are marked with "[2]". Each entry begins with its presenter(s) and presentation date.
  1. [Presented by William M. on Feb. 1] PRESENTATION Harnad, S., The Symbol Grounding Problem, Physica D 42: 335-346, 1990.
  2. [Presented by Daniel A. and Shivang S. on Feb. 1] PRESENTATION [2] Tadas Baltrusaitis, Chaitanya Ahuja, Louis-Philippe Morency, Multimodal Machine Learning: A Survey and Taxonomy, 2017.
  3. [Presented by Ojas P. on Feb. 3] PRESENTATION Matt MacMahon, Brian Stankiewicz, and Benjamin Kuipers, Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions, AAAI, 2006.
  4. [Presented by Yijin Z. on Feb. 3] PRESENTATION David L. Chen and Raymond J. Mooney, Learning to Interpret Natural Language Navigation Instructions from Observations, In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI), 859-865, August 2011.
  5. [Presented by William M. on Feb. 8] PRESENTATION Hongyuan Mei, Mohit Bansal, and Matthew R. Walter. Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, In Proceedings of the National Conference on Artificial Intelligence (AAAI), 2016.
  6. [Presented by Ryo K. on Feb. 8] PRESENTATION Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, Yoav Artzi, TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments, CVPR, 2019
  7. [Presented by Ojas P. on Feb. 10] PRESENTATION Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, Dieter Fox, ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks, CVPR 2020.
  8. [Presented by William M. and Ojas P. on Feb. 10] PRESENTATION [2] Md. Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga, A Comprehensive Survey of Deep Learning for Image Captioning, ACM Computing Surveys (October 2018).
  9. [Presented by Ryo K. on Feb. 24] PRESENTATION Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), full paper, Melbourne, Australia, July 15-20, 2018
  10. [Presented by Kiran R. on Feb. 24] PRESENTATION Subhashini Venugopalan and Marcus Rohrbach and Jeff Donahue and Raymond J. Mooney and Trevor Darrell and Kate Saenko, Sequence to Sequence -- Video to Text, In Proceedings of the 2015 International Conference on Computer Vision (ICCV-15), Santiago, Chile, December 2015.
  11. [Presented by Daniel A. on Mar. 1] PRESENTATION Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, and William Yang Wang, VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research, Proceedings of the 17th CVF/IEEE International Conference on Computer Vision (ICCV 2019), Seoul, Korea.
  12. [Presented by Yijin Z. and Ryo K. on Mar. 1] PRESENTATION [2] Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, VQA: Visual Question Answering, International Conference on Computer Vision (ICCV), 2015.
  13. [Presented by Daniel A. on Mar. 3] PRESENTATION Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Dan Klein, Learning to Compose Neural Networks for Question Answering, NAACL 2016.
  14. [Presented by Jay L. on Mar. 3] PRESENTATION Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, Lei Zhang, Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, CVPR, 2018.
  15. [Presented by Bill Y. on Mar. 8] PRESENTATION Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal, TVQA+: Spatio-Temporal Grounding for Video Question Answering, ACL 2020.
  16. [Presented by Jay L. on Mar. 8] PRESENTATION Xintong Yu, Hongming Zhang, Yangqiu Song, Yan Song, and Changshui Zhang, What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues, EMNLP, 2019.
  17. [Presented by Kiran R. and Andrei A. on Mar. 10] PRESENTATION [2] Carina Silberer, Vittorio Ferrari, and Mirella Lapata. Visually Grounded Meaning Representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39:11, 2284--2297, 2017.
  18. [Presented by Andrei A. on Mar. 10] PRESENTATION Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel. A Survey of Reinforcement Learning Informed by Natural Language. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI 2019).
  19. [Presented by Yijin Z. on Mar. 22] PRESENTATION Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee, Generative Adversarial Text to Image Synthesis, ICML 2016.
  20. [Presented by Bill Y. and Jay L. on Mar. 22] PRESENTATION [2] David Harwath, Adria Recasens, Didac Suris, Galen Chuang, Antonio Torralba, and James Glass, Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input, Proceedings of the European Conference on Computer Vision (ECCV), 2018.
  21. [Presented by Bill Y. on Mar. 24] PRESENTATION Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, NeurIPS, 2019.
  22. [Presented by Shivang S. on Mar. 24] PRESENTATION Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu, Large-Scale Adversarial Training for Vision-and-Language Representation Learning, NeurIPS, 2020.
  23. [Presented by Kiran R. on Mar. 29] PRESENTATION Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell, Generating Visual Explanations, European Conference on Computer Vision (ECCV), 2016.
  24. [Presented by Andrei A. on Mar. 29] PRESENTATION Ronghang Hu, Jacob Andreas, Trevor Darrell, Kate Saenko, Explainable Neural Computation via Stack Neural Module Networks, ECCV, 2018.
  25. [Presented by Shivang S. on Mar. 31] PRESENTATION Weixin Liang, James Zou, Zhou Yu, ALICE: Active Learning with Contrastive Natural Language Explanations, EMNLP 2020

Special Presentations

  1. April 5: Instructor Presentation, Explainable AI: Making Visual Question Answering Systems More Transparent
  2. April 7: Recorded Talk: Yonatan Bisk, "ALFRED -- A Simulated Playground for Connecting Language, Action, and Perception," video
  3. April 12: Recorded Talk: Jingjing Liu, "Multimodal AI: Self-supervised Learning, Adversarial Training, and Vision+Language Inference," video
  4. April 14: Recorded Talk: Jeanette Bohg, "Leveraging Language in Learning Robot Manipulation Skills," video
  5. April 19: Guest Speaker: Peter Anderson (Google Austin)
  6. April 21: Guest Speaker: Chen Yu (UT Psychology), "Language learning in humans and machines: two sides of the same coin"
  7. April 26: Guest Speaker: David Harwath (UTCS), "Grounding as a Guide: Making Sense of Spoken Language," [PRESENTATION]

Class Project Presentations

  1. April 28: [Ojas Patel & Kiran Raja]
  2. May 3: [Andrei Amatuni & Yijin Zhao]
  3. May 3: [Ryo Kamoi & Jie Hao Liao]
  4. May 5: [Bill Yang & William Macke]
  5. May 5: [Shivang Singh & Daniel Almeraz]