Vast amounts of human knowledge are available in unstructured text, but this information is largely inaccessible to computers. I am interested in solving core natural language processing problems so we can add structure to this text and better access that information. Our work focuses primarily on building structured statistical models for these kinds of problems, combining linguistic intuition with data-driven machine learning to build effective systems. I am interested in a wide range of NLP problems and techniques, including those I haven't had a chance to work on yet! A few of my current research interests include:
Incorporating inductive biases into deep models: One advantage of deep neural network models is that they can be scaled up in capacity to fit extremely large datasets. However, on more modestly sized datasets, these models tend to overfit; put another way, they can learn to explain the data in many ways, not all of which generalize. Inductive bias helps us make sure these models learn to explain the data in the "right way" that will generalize to new examples. In ongoing work, we are exploring how techniques like data augmentation (Goyal and Durrett, ACL19), distant supervision (Onoe and Durrett, NAACL19), and latent structure models (Xu and Durrett, EMNLP18) can improve generalization without changing the underlying neural network architectures.
Building modular NLP systems with discrete structure: While end-to-end pre-trained models like BERT have achieved high performance on benchmark datasets, they may be hard for practitioners to understand or extend and may require significant tuning to work well. Complex NLP systems can benefit from modularity and abstraction, particularly when modules interact in terms of understandable discrete components. In past work, we showed how discrete parse structures and coreference information can help a summarization system (Durrett et al., ACL16). Our ongoing work explores various models that incorporate abstractions like entity types (Onoe and Durrett, NAACL19) to achieve better performance in low-data or cross-domain settings.
Understanding datasets and model behavior: Sometimes, neural models can fit training sets and even achieve apparently good generalization to new examples, but not for the reasons we expect. Models that intuitively should be able to handle a task well may simply not exploit the desired phenomena (Ferracane et al., ACL19; Mueller and Durrett, EMNLP18). The data itself may be to blame: the training set may have subtle surface patterns that our models learn instead of the true underlying phenomena (Chen and Durrett, NAACL19).