Prior to joining UT Austin, I received my Ph.D. from UC Berkeley in 2016,
where I was advised by Dan Klein
and part of the Berkeley NLP Group.
- Compressive Summarization with Plausibility and Salience Modeling
Shrey Desai, Jiacheng Xu, and Greg Durrett. EMNLP 2020.
- Evaluating Factuality in Generation with Dependency-level Entailment
Tanya Goyal and Greg Durrett. Findings of EMNLP 2020.
- Interpretable Entity Representations through Large-Scale Typing
Yasumasa Onoe and Greg Durrett. Findings of EMNLP 2020.
- Understanding Neural Abstractive Summarization Models via Uncertainty
Jiacheng Xu, Shrey Desai, and Greg Durrett. EMNLP 2020 (short).
- Calibration of Pre-trained Transformers
Shrey Desai and Greg Durrett. EMNLP 2020 (short).
- Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom and Greg Durrett. Findings of EMNLP 2020 (short).
- Inquisitive Question Generation for High Level Text Comprehension
Wei-Jen Ko, Te-Yuan Chen, Yiyan Huang, Greg Durrett, and Junyi Jessy Li. EMNLP 2020.
- Sketch-Driven Regular Expression Generation from Natural Language and Examples
Xi Ye, Qiaochu Chen, Xinyu Wang, Isil Dillig, and Greg Durrett. TACL 2020.
arxiv
- Neural Syntactic Preordering for Controlled Paraphrase Generation
Tanya Goyal and Greg Durrett. ACL 2020.
arxiv
- Benchmarking Multimodal Regex Synthesis with Complex Structures
Xi Ye, Qiaochu Chen, Isil Dillig, and Greg Durrett. ACL 2020.
arxiv
- Multi-Modal Synthesis of Regular Expressions
Qiaochu Chen, Xinyu Wang, Xi Ye, Greg Durrett, and Isil Dillig. PLDI 2020.
arxiv
- LambdaNet: Probabilistic Type Inference using Graph Neural Networks
Jiayi Wei, Maruth Goyal, Greg Durrett, and Isil Dillig. ICLR 2020.
OpenReview
- Fine-Grained Entity Typing for Domain Independent Entity Linking
Yasumasa Onoe and Greg Durrett. AAAI 2020.
arxiv code
- Neural Extractive Text Summarization with Syntactic Compression
Jiacheng Xu and Greg Durrett. EMNLP 2019.
arxiv
- Effective Use of Transformer Networks for Entity Tracking
Aditya Gupta and Greg Durrett. EMNLP 2019.
arxiv
- Query-focused Scenario Construction
Su Wang, Greg Durrett, and Katrin Erk. EMNLP 2019.
pdf
- Embedding Time Expressions for Deep Temporal Ordering Models
Tanya Goyal and Greg Durrett. ACL 2019 (short).
arxiv
- Evaluating Discourse in Structured Text Representations
Elisa Ferracane, Greg Durrett, Junyi Jessy Li, and Katrin Erk. ACL 2019 (short).
arxiv
- Learning to Denoise Distantly-Labeled Data for Entity Typing
Yasumasa Onoe and Greg Durrett. NAACL 2019.
pdf code
- Understanding Dataset Design Choices for Multi-hop Reasoning
Jifan Chen and Greg Durrett. NAACL 2019 (short).
pdf
- Linguistically-Informed Specificity and Semantic Plausibility for Dialog Generation
Wei-Jen Ko, Greg Durrett, and Junyi Jessy Li. NAACL 2019.
pdf
- Tracking Discrete and Continuous Entity State for Process Understanding
Aditya Gupta and Greg Durrett. NAACL 2019 Workshop on Structured Prediction for NLP (oral).
pdf
- Domain Agnostic Real-Valued Specificity Prediction
Wei-Jen Ko, Greg Durrett, and Junyi Jessy Li. AAAI 2019.
pdf code
- Spherical Latent Spaces for Stable Variational Autoencoders
Jiacheng Xu and Greg Durrett. EMNLP 2018.
pdf code
- Effective Use of Context in Noisy Entity Linking
David Mueller and Greg Durrett. EMNLP 2018 (short).
pdf
- Picking Apart Story Salads
Su Wang, Eric Holgate, Greg Durrett, and Katrin Erk. EMNLP 2018.
pdf code and data
- Modeling Semantic Plausibility by Injecting World Knowledge
Su Wang, Greg Durrett, and Katrin Erk. NAACL 2018 (short).
pdf
- Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation
Greg Durrett, Jonathan Kummerfeld, Taylor Berg-Kirkpatrick, Rebecca Portnoff, Sadia Afroz, Damon McCoy, Kirill Levchenko, and Vern Paxson. EMNLP 2017.
pdf BibTeX code data
- Tools for Automated Analysis of Cybercriminal Markets
Rebecca Portnoff, Sadia Afroz, Greg Durrett, Jonathan Kummerfeld, Taylor Berg-Kirkpatrick, Damon McCoy, Kirill Levchenko, and Vern Paxson. WWW 2017.
pdf BibTeX code data
- Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
Greg Durrett, Taylor Berg-Kirkpatrick, and Dan Klein. ACL 2016.
pdf BibTeX system release poster
- Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks
Matthew Francis-Landau, Greg Durrett, and Dan Klein. NAACL 2016.
pdf BibTeX
- Neural CRF Parsing
Greg Durrett and Dan Klein. ACL 2015.
pdf synopsis BibTeX system release slides.key slides.pdf
Parsing algorithms like CKY are good at reasoning over discrete syntactic structures, while neural networks are good at extracting nonlinear features from inputs. We show that neural networks can score CFG rule productions in a continuous way while the grammar itself stays discrete.
- Disfluency Detection with a Semi-Markov Model and Prosodic Features
James Ferguson, Greg Durrett, and Dan Klein. NAACL 2015.
pdf synopsis BibTeX
When dealing with spoken language, identifying disfluencies like restarts ("I went–I used to go shopping") is an important prerequisite for figuring out what the speaker intended to say. We show that a semi-Markov conditional random field is well-suited to this task since it can make decisions about entire chunks of the sentence at once, rather than deciding in isolation whether or not each word is disfluent. Features targeting the speaker's prosody (directly in the speech signal) give further performance improvements.
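The chunk-level decision-making above can be illustrated with a toy semi-Markov dynamic program. This is only a sketch of the decoding step, not the paper's model: `span_score` stands in for the learned CRF chunk scorer, and the names here are illustrative.

```python
def best_segmentation(n, span_score, max_len=4):
    """Semi-Markov decoding (toy sketch): choose the best partition of
    positions 0..n-1 into contiguous chunks, where each candidate chunk
    (i, j) is scored as a unit by span_score(i, j)."""
    best = [0.0] + [float("-inf")] * n  # best[j] = best score over prefix 0..j
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            s = best[i] + span_score(i, j)
            if s > best[j]:
                best[j], back[j] = s, i
    # recover the chunk boundaries by walking backpointers
    spans, j = [], n
    while j > 0:
        spans.append((back[j], j))
        j = back[j]
    return best[n], spans[::-1]
```

Because each `(i, j)` chunk is scored as a whole, features over the entire span (such as prosodic cues across a restart) are available at decision time, which token-level tagging cannot offer.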
- A Joint Model for Entity Analysis: Coreference, Typing, and Linking
Greg Durrett and Dan Klein. TACL 2014.
pdf synopsis BibTeX system release slides.key slides.pdf
Using a graphical modeling framework, we can unify component models for coreference resolution, semantic typing (i.e. named entity recognition), and entity linking to a knowledge base, improving performance on all three tasks by capturing cross-task interactions.
- Less Grammar, More Features
David Hall, Greg Durrett, and Dan Klein. ACL 2014.
pdf errata synopsis BibTeX system release
Traditional approaches to syntactic parsing have relied on heavily annotated context-free grammars which maintain a large amount of state during parsing. We show that it is possible to build a high-performance parser with a very simple grammar as long as CFG rule productions are scored in a rich way. Our scoring function inspects the words in the sentence that are dominated by the rule in question; this lets us use a simpler grammar because we're looking at important lexical information directly rather than trying to thread it through CFG states.
The paper posted here has updated sentiment analysis results compared to the version in the ACL anthology. In the original version of this paper, we trained and evaluated our system on a slightly different dataset from that of Socher et al. (2013) due to a misunderstanding of their evaluation condition. However, the general conclusions about our system's performance relative to theirs are unchanged.
- Easy Victories and Uphill Battles in Coreference Resolution
Greg Durrett and Dan Klein. EMNLP 2013. Best Paper Finalist
pdf synopsis BibTeX system release slides.key slides.pdf
We present a simple model for coreference resolution that primarily exploits surface textual properties to determine if two mentions (like "Barack Obama" and "the president") refer to the same entity. Features in our discriminative model targeting these surface properties (including, for example, the first and last words in an entity reference) manage to capture a wide range of linguistic phenomena important for coreference, such as pronoun agreement and discourse structure. Crucially, our surface features can also pick up on other "extra-linguistic" patterns in the data. As a result, our model outperforms previous work that only uses hand-coded heuristics to target linguistic properties of interest.
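A toy sketch of what such surface features might look like for a mention pair. The helper and feature names below are purely illustrative, not the paper's actual feature set:

```python
def pair_features(mention, antecedent):
    """Illustrative surface features for a candidate coreference pair:
    conjunctions of simple string properties stand in for linguistic rules."""
    m, a = mention.split(), antecedent.split()
    return {
        f"first_words={m[0].lower()}|{a[0].lower()}": 1.0,
        f"last_words={m[-1].lower()}|{a[-1].lower()}": 1.0,
        f"exact_match={' '.join(m).lower() == ' '.join(a).lower()}": 1.0,
        f"head_match={m[-1].lower() == a[-1].lower()}": 1.0,
    }
```

A discriminative model over many such sparse conjunctions can implicitly learn agreement and matching patterns that would otherwise require hand-coded heuristics.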
- Decentralized Entity-Level Modeling for Coreference Resolution
Greg Durrett, David Hall, and Dan Klein. ACL 2013.
pdf synopsis BibTeX slides.key slides.pdf
We present a coreference resolution model that can efficiently represent and infer semantic properties associated with entities. During inference, we propagate semantic information along coreference arcs in a probabilistic way, allowing us to discern that "Clinton" is female if that mention's antecedent is "Hillary Clinton."
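The propagation idea can be illustrated with a toy mixture over a single arc. This is only a sketch under simplifying assumptions; the paper performs proper probabilistic inference over the whole document, not a one-step mixture:

```python
def propagate(antecedent_dist, arc_prob, prior):
    """Toy sketch: blend an antecedent's property distribution (e.g. over
    gender values) into a mention's prior, weighted by the probability
    that the coreference arc is correct."""
    return {k: arc_prob * antecedent_dist.get(k, 0.0)
               + (1 - arc_prob) * prior.get(k, 0.0)
            for k in set(antecedent_dist) | set(prior)}
```

With a confident arc from "Clinton" to "Hillary Clinton", the antecedent's strong "female" belief dominates the mention's uninformative prior.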
- Unsupervised Transcription of Historical Documents
Taylor Berg-Kirkpatrick, Greg Durrett, and Dan Klein. ACL 2013.
pdf synopsis BibTeX system release
We present a system for optical character recognition on historical documents produced by printing presses. For each document, we learn (with no supervision) the properties of that document, including the specific font used and the darkness with which characters were printed, leading to greater recognition accuracy. Our system dramatically outperforms commercial OCR tools on historical documents, making around half as many errors at the word level.
- Supervised Learning of Complete Morphological Paradigms
Greg Durrett and John DeNero. NAACL 2013.
pdf errata synopsis BibTeX dataset code slides.key slides.pdf
We propose a model that can learn how to inflect words (e.g. conjugate verbs, decline nouns) based on labeled examples from Wiktionary. We extract rules that describe how part of a word like the prefix or suffix changes for each possible setting of morphological attributes (person, singular vs. plural, etc.), then learn how to predict which change rules apply to words we've never seen before.
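The rule-extraction step can be sketched in a few lines. This is a simplification for illustration only: it handles suffix changes by stripping the longest common prefix, whereas the full model also deals with prefixes, stem changes, and choosing among competing rules:

```python
def suffix_rule(lemma, inflected):
    """Extract a suffix-change rule from one (lemma, inflected) pair by
    stripping the longest common prefix: returns (old_suffix, new_suffix)."""
    i = 0
    while i < min(len(lemma), len(inflected)) and lemma[i] == inflected[i]:
        i += 1
    return lemma[i:], inflected[i:]

def apply_rule(rule, word):
    """Apply an extracted rule to a new word, or return None if it doesn't fit."""
    old, new = rule
    if word.endswith(old):
        return word[: len(word) - len(old)] + new
    return None
```

The learning problem is then to predict, for an unseen lemma and a target attribute setting, which of the extracted rules applies.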
The paper posted here differs slightly from the one in the ACL anthology. After submitting the camera-ready copy, we discovered that our Finnish nouns dataset also contained adjectives, due to incorrect assumptions we made when scraping Wiktionary. The paper and accompanying dataset are updated to reflect that our evaluation condition is over both Finnish nouns and adjectives simultaneously. Beyond this modification, the results and their interpretation are unchanged.
- Syntactic Transfer Using a Bilingual Lexicon
Greg Durrett, Adam Pauls, and Dan Klein. EMNLP 2012.
pdf synopsis BibTeX slides.pdf
We collect information about the syntactic behavior of English words using a large corpus, then show that this information can be ported to other languages by translating the words using a bilingual dictionary. Having this information helps improve the performance of a dependency parser for resource-poor languages where large labeled treebanks are not available.
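A minimal sketch of the transfer step, with made-up statistics: the paper transfers richer syntactic features than the single per-word score used here, and the function and data names are illustrative.

```python
def transfer_word_stats(english_stats, bilingual_lexicon):
    """Toy sketch: port per-word syntactic statistics (here, one number
    per word) to another language via a word-level bilingual dictionary."""
    transferred = {}
    for en_word, stat in english_stats.items():
        for foreign_word in bilingual_lexicon.get(en_word, []):
            # average when several English words map to the same foreign word
            transferred.setdefault(foreign_word, []).append(stat)
    return {w: sum(v) / len(v) for w, v in transferred.items()}
```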
- An Empirical Investigation of Discounting in Cross-Domain Language Models
Greg Durrett and Dan Klein. ACL 2011 (short).
pdf synopsis BibTeX slides.pdf
We modify a standard technique for estimating n-gram language models to give better performance when those language models are applied to text from other domains. We find this is useful even for domains that differ only slightly.
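As a toy illustration of discounting, here is absolute discounting with unigram backoff on bigram counts. This is a textbook baseline for context, not the paper's cross-domain estimator, and the names are illustrative:

```python
from collections import Counter

def discounted_bigram_prob(bigrams, unigrams, w1, w2, d=0.75):
    """Absolute discounting (toy sketch): subtract d from each observed
    bigram count and redistribute the freed mass via the unigram
    distribution."""
    total = sum(unigrams.values())
    c12, c1 = bigrams[(w1, w2)], unigrams[w1]
    if c1 == 0:
        return unigrams[w2] / total  # no history: back off entirely
    # the reserved mass is d per distinct observed continuation of w1
    n_types = sum(1 for (a, _) in bigrams if a == w1)
    backoff_weight = d * n_types / c1
    return max(c12 - d, 0) / c1 + backoff_weight * unigrams[w2] / total
```

The question the paper studies is how the discount `d` should behave when the model is estimated on one domain and evaluated on another.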