NL4SE Reading Group

Natural Language for Software Engineering

Fall 2018 Meetings

We meet bi-weekly on Fridays 9:00am-10:00am in GDC 3.516.

Reading Schedule

Date Topic
9/20/2018 Comment Generation (Bibliography).

Previous Reading Schedule

Papers discussed prior to Spring 2018 can be found here.

Date Paper
8/01/2018 Arianna Blasi, Alberto Goffi, Konstantin Kuznetsov, Alessandra Gorla, Michael D. Ernst, Mauro Pezzè, and Sergio Delgado Castellanos. 2018. Translating Code Comments to Procedure Specifications. In ISSTA 2018, Proceedings of the 2018 International Symposium on Software Testing and Analysis (Amsterdam, Netherlands).
7/25/2018 Pengcheng Yin, Bowen Deng, Edgar Chen, Bogdan Vasilescu, and Graham Neubig. 2018. Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow. International Conference on Mining Software Repositories (MSR). Gothenburg, Sweden.
7/18/2018 Vijayaraghavan Murali, Letao Qi, Swarat Chaudhuri, and Chris Jermaine. 2018. Neural Sketch Learning for Conditional Program Generation. In ICLR 2018.
7/04/2018 Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. 2018. Learning to Represent Programs with Graphs. In Proceedings of the International Conference on Learning Representations (ICLR).
6/27/2018 Miltiadis Allamanis, Earl T Barr, Premkumar Devanbu, and Charles Sutton. 2017. A Survey of Machine Learning for Big Code and Naturalness. arXiv preprint arXiv:1709.06182.
6/13/2018 NLP Final Project Report
5/30/2018 Alessandra Gorla, Ilaria Tavecchia, Florian Gross, and Andreas Zeller. 2014. Checking App Behavior Against App Descriptions. In Proceedings of the 36th International Conference on Software Engineering (pp. 1025-1035). ACM.
5/15/2018 NLP Final Project Reports
4/24/2018 Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, and Huan Sun. 2018. StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow. In Proceedings of the 27th International Conference on World Wide Web (WWW 2018).
4/10/2018 Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish K. Shevade. 2017. DeepFix: Fixing Common C Language Errors by Deep Learning. In AAAI, 2017.
3/27/2018 Xinyun Chen, Chang Liu, and Dawn Song. 2018. Tree-to-tree Neural Networks for Program Translation. arXiv:1802.03691 [cs.AI].
3/06/2018 Vincent J Hellendoorn and Premkumar Devanbu. 2017. Are Deep Neural Networks the Best Choice for Modeling Source Code? In Proceedings of the International Symposium on Foundations of Software Engineering (FSE). ACM, 763–773.
2/27/2018 Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D. Ernst. 2018. NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System. LREC 2018.
2/13/2018 Pengcheng Yin and Graham Neubig. 2017. A Syntactic Neural Model for General-Purpose Code Generation. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) (2017).
1/30/2018 Pablo Loyola, Edison Marrese-Taylor, and Yutaka Matsuo. 2017. A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes. In Proceedings of ACL.

Proposed Papers

Xiaolong Li and Kristy Elizabeth Boyer. 2015. Semantic Grounding in Dialogue for Complex Problem Solving. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics and Human Language Technology (NAACL HLT), 841-850.

Kelvin Guu, Panupong Pasupat, Evan Zheran Liu, and Percy Liang. 2017. From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood. In Association for Computational Linguistics (ACL).

Jeremy Lacomis, Alan Jaffe, Edward J. Schwartz, Claire Le Goues, and Bogdan Vasilescu. 2018. Statistical Machine Translation is a Natural Fit for Automatic Identifier Renaming in Software Source Code.

Zexuan Zhong, Jiaqi Guo, Wei Yang, Tao Xie, Jian-Guang Lou, Ting Liu, and Dongmei Zhang. 2018. Generating Regular Expressions from Natural Language Specifications: Are We There Yet?

Osbert Bastani, Rahul Sharma, Alex Aiken, and Percy Liang. 2017. Synthesizing Program Input Grammars. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 2017.

Siyuan Jiang, Ameer Armaly, and Collin McMillan. 2017. Automatically Generating Commit Messages from Diffs using Neural Machine Translation. arXiv preprint arXiv:1708.09492 (2017).

Xi Victoria Lin, Chenglong Wang, Deric Pang, Kevin Vu, and Michael D Ernst. 2017. Program Synthesis from Natural Language Using Recurrent Neural Networks. Technical Report UW-CSE-17-03-01, University of Washington Department of Computer Science and Engineering, Seattle, WA, USA.

Maxim Rabinovich, Mitchell Stern, and Dan Klein. 2017. Abstract Syntax Networks for Code Generation and Semantic Parsing. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).

Antonio Valerio Miceli Barone and Rico Sennrich. 2017. A parallel corpus of Python functions and documentation strings for automated code documentation and code generation. arXiv preprint arXiv:1707.02275.

Illia Polosukhin and Alexander Skidanov. 2018. Neural Program Search: Solving Programming Tasks From Description and Examples. arXiv:1802.04335 [cs.AI].

Qiao Huang, Emad Shihab, Xin Xia, David Lo, and Shanping Li. 2018. Identifying self-admitted technical debt in open source projects using text mining. Empirical Software Engineering 23(1): 418-451 (2018).

Everton da S. Maldonado, Emad Shihab, and Nikolaos Tsantalis. 2017. Using Natural Language Processing to Automatically Detect Self-Admitted Technical Debt. IEEE Trans. Software Eng. 43(11): 1044-1062 (2017).

Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. 2015. Compositional Program Synthesis from Natural Language and Examples. In Proceedings of IJCAI.

Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing Source Code using a Neural Attention Model. In Proceedings of ACL.

Alan Jaffe, Jeremy Lacomis, Edward J. Schwartz, Claire Le Goues, and Bogdan Vasilescu. 2018. Meaningful Variable Names for Decompiled Code: A Machine Translation Approach. In International Conference on Program Comprehension (2018).

Lin Tan, Ding Yuan, Gopal Krishna, and Yuanyuan Zhou. 2007. /* iComment: Bugs or Bad Comments? */. In ACM SIGOPS Symposium on Operating Systems Principles (SOSP), pages 145–158, 2007.

Jaroslav Fowkes, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata, and Charles Sutton. 2014. Autofolding for Source Code Summarization. arXiv preprint arXiv:1403.4503 (2014).

Kyle Richardson, Jonathan Berant, and Jonas Kuhn. 2018. Polyglot Semantic Parsing in APIs. In Proceedings of NAACL.

Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang, and Dragomir Radev. 2018. TypeSQL: Knowledge-based Type-Aware Neural Text-to-SQL Generation. In NAACL 2018.

Miltiadis Allamanis, Earl T Barr, Christian Bird, and Charles Sutton. 2015. Suggesting accurate method and class names. In Proceedings of the Joint Meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), 2015.

Pengcheng Yin, Bowen Deng, Edgar Chen, Bogdan Vasilescu, and Graham Neubig. 2018. Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow. International Conference on Mining Software Repositories (MSR). Gothenburg, Sweden.

Miltiadis Allamanis, Hao Peng, and Charles Sutton. 2016. A Convolutional Attention Network for Extreme Summarization of Source Code. arXiv preprint arXiv:1602.03001.

Annie Louis, Santanu Kumar Dash, Earl T. Barr, and Charles Sutton. 2018. Deep Learning to Detect Redundant Method Comments. arXiv preprint arXiv:1806.04616.

Inderjot Kaur Ratol and Martin P. Robillard. 2017. Detecting Fragile Comments. In IEEE/ACM International Conference on Automated Software Engineering (ASE).

Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. 2017. DeepCoder: Learning to Write Programs. In ICLR 2017.

Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. 2018. code2vec: Learning Distributed Representations of Code. arXiv preprint arXiv:1803.09473.

Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, and Sunghun Kim. 2016. Deep API Learning. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, 631–642.

Xiaodong Gu, Hongyu Zhang, and Sunghun Kim. 2018. Deep Code Search. In Proceedings of the 40th International Conference on Software Engineering. ACM, 2018.

Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen. 2013. Boa: A Language and Infrastructure for Analyzing Ultra-Large-Scale Software Repositories. In Proceedings of the 2013 International Conference on Software Engineering, pages 422–431. IEEE Press, 2013.

Yuding Liang and Kenny Q. Zhu. 2018. Automatic Generation of Text Descriptive Comments for Code Blocks. In AAAI, 2018.


Please send suggestions for papers you would like to discuss to