Vast amounts of human knowledge are available in text, but when that text is unstructured, the information is largely inaccessible to computers. I am interested in solving core natural language processing problems so we can add structure to this text and better access that information. My work has focused primarily on building structured statistical models for these kinds of problems, combining linguistic intuition with data-driven machine learning to build effective systems. I am interested in a wide range of NLP problems and techniques, including those I haven't had a chance to work on yet! A few of my current research interests include:
Incorporating inductive biases into deep models: One advantage of deep neural network models is that they can be scaled up in capacity to fit extremely large datasets. However, on more modestly sized datasets, these models tend to overfit; put another way, they can learn to explain the data in many ways. Inductive bias helps us make sure these models learn to explain the data in the "right way" that will generalize to new examples. In ongoing work, we are exploring how constraining attention mechanisms, imposing structure on latent spaces of autoencoders, and strengthening the supervision in mathematical reasoning systems can all improve the performance of neural models, without changing the underlying neural network architectures. Such developments will help us ensure that sophisticated models actually make good use of the provided training data rather than overfitting on surface characteristics.
Combining strong models and discrete structures: Building explicit structure into machine-learned models is another way of leveraging linguistic knowledge to improve our systems. In past work, we showed how discrete parse structures and coreference information can help a summarization system (ACL 2016). Our ongoing work explores incorporating linguistic features into neural network models of coreference resolution and entity linking: in tasks like these, neural networks do not always learn the right structure from data, particularly when transferring to new domains, and explicitly encoding the right abstractions can be effective.
Integrating NLP components to make holistic systems: Models for analyzing text can only work so well when operating in isolation, and they often stumble when it comes to robustly integrating world knowledge from knowledge bases like Wikipedia. For example, current coreference systems might know that France is a country, but it is much harder to support complex logical inferences: a country is the same thing as a nation, which also has a government, which might be referred to by the country's name in certain settings (e.g., "France opposed the UN resolution"). Capturing all of this requires drawing on multiple knowledge sources, using information from unstructured text that might not be explicitly represented in any knowledge base, and making context-dependent inferences.