My group's research revolves around developing new methods for natural language processing. Deep learning and pre-trained models have achieved great success at fitting data and addressing problems with large in-domain training sets. But how can we apply these models to new domains and new settings with limited data? We approach this broad goal from many directions: robustness, so models work better in these new settings; controllability, to give users what they actually want; and interpretability, so we can understand and debug our systems' failure modes.
Summarization and Generation: A number of improvements are needed before the promises of broad-domain text generation systems can be truly realized. These models need to be controllable (Goyal and Durrett, ACL20) so we can change their behavior on-the-fly for target applications. They need to be learnable from small amounts of data (Desai, Xu, and Durrett, EMNLP20) so we can adapt them to data-scarce settings. They need to be factual with respect to their inputs (Goyal and Durrett, EMNLP-Findings20) so we can trust what they're saying. All of these goals are enabled by better understanding of these models' behavior (Xu, Desai, and Durrett, EMNLP20); understanding how decisions arise sequentially in generation models is critical for all of these aspects.
Question answering: Pre-trained models have achieved strong performance on a range of QA tasks. Although successful models do not necessarily need to reproduce human-like reasoning, answering questions in the right way can improve generalization and robustness. Models that produce discrete reasoning chains can be more effective even without explicit supervision of this reasoning process (Chen, Lin, and Durrett, arXiv19). Structuring the model to understand the reasoning that leads to the answer can improve robustness to adversarial attacks and enable better-calibrated QA models, particularly on new domains (Chen and Durrett, arXiv20).
Entity representation: Pre-trained models capture rich knowledge about entities, but they also encode biases, can be misled by ambiguous entities, and may not handle rare entities well. One line of our work aims to make this knowledge explicit by building strong fine-grained entity typing models (Onoe and Durrett, NAACL19; Onoe and Durrett, arXiv21) and entity tracking models (Gupta and Durrett, EMNLP19). We showed that entity types can be directly used as distributed representations for downstream tasks such as entity linking (Onoe and Durrett, AAAI20). One major benefit of this approach is that it is debuggable (Onoe and Durrett, EMNLP-Findings20): we can modify these representations on-the-fly in new domains with heuristics or other techniques to improve the performance of models that build on them.
Understanding datasets and model behavior: Sometimes, neural models can fit training sets and even achieve apparently good generalization to new examples, but not for the reasons we expect. Models that intuitively should be able to handle a task well may simply not exploit the desired phenomena (Ferracane et al., ACL19; Mueller and Durrett, EMNLP18). The data itself may be to blame: the training set may contain subtle surface patterns that our models learn instead of the true underlying phenomena (Chen and Durrett, NAACL19).