Vast amounts of human knowledge are available in text, but when that text is unstructured, the information it contains is largely inaccessible to computers. I am interested in solving core natural language processing problems so that we can add structure to this text and better access that information. For example, a syntactic parser can identify verbs and their arguments, which tell us about events and their participants. A coreference resolution system can then identify expressions referring to the same entity, building a more complete picture of the events being described.
My work has focused primarily on building structured statistical models for these kinds of problems, combining linguistic intuition with data-driven machine learning to make effective systems. I am interested in a wide range of NLP problems and techniques, including those I haven't had a chance to work on yet! A few current research interests of mine include:
Combining deep learning and discrete structures: Deep neural networks have recently proven effective at sequence transduction problems, and they become even more powerful when they incorporate structured mechanisms like attention (a soft alignment to the input). There are good opportunities to use even more explicit structure. For generating longer texts like summaries, discrete variables that track entity references can help capture pragmatic phenomena, and coverage models that explicitly track which aspects of the input have already been discussed can also impose a useful discourse structure. Giving a model the right inductive biases in the form of these structures would let us learn better models from smaller amounts of data.
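To make the idea of attention as soft alignment concrete, here is a minimal NumPy sketch of dot-product attention; the function and variable names are illustrative, not drawn from any particular system discussed here.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(query, keys, values):
    """Soft alignment over input positions (a sketch, not a full model).

    query:  (d,)   a decoder state
    keys:   (n, d) encoder states, one per input position
    values: (n, d) encoder states (often identical to keys)
    Returns the attention weights and the weighted context vector.
    """
    scores = keys @ query       # one alignment score per input position
    weights = softmax(scores)   # normalize into a soft distribution
    context = weights @ values  # weighted average of the input
    return weights, context

# Toy example: the query points toward the second input position,
# so that position receives the largest attention weight.
keys = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
query = np.array([0.0, 2.0])
weights, context = attend(query, keys, keys)
```

The weights form a distribution over input positions rather than a hard choice, which is exactly what distinguishes this soft alignment from the fully discrete structures mentioned above.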
Integrating NLP components to make holistic systems: Models for analyzing text can only work so well in isolation, and they often stumble when it comes to robustly integrating world knowledge from resources like Wikipedia. For example, current coreference systems might know that France is a country, but it is much harder to support complex logical inferences: a country is the same thing as a nation, and it also has a government, which might be referred to with the country's name in certain settings (France opposed the UN resolution). Capturing all of this requires drawing on multiple knowledge sources, using information from unstructured text that might not be explicitly represented in any knowledge base, and making context-dependent inferences.
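A toy illustration of the kind of knowledge-backed compatibility check described above. The tiny "knowledge base" and the equivalence/metonymy rules here are hypothetical stand-ins; a real system would draw on large resources and make genuinely context-dependent decisions rather than dictionary lookups.

```python
# Hypothetical miniature knowledge base: entity -> known facts.
KB = {
    "France": {"type": "country", "has": {"government"}},
}

# Hand-written type equivalences standing in for logical inference
# (a country is the same thing as a nation).
EQUIVALENT_TYPES = {
    "country": {"country", "nation"},
}

def compatible(mention_type, entity):
    """Can a mention of this type plausibly refer to the KB entity?"""
    facts = KB[entity]
    if mention_type in EQUIVALENT_TYPES.get(facts["type"], {facts["type"]}):
        return True
    # Metonymy: "the government" can stand in for the country itself,
    # as in "France opposed the UN resolution".
    return mention_type in facts["has"]
```

Under these toy rules, mentions typed as "nation" or "government" are compatible with France, while an unrelated type like "city" is not; the hard part, of course, is deciding when such inferences are licensed by context.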
Understanding structures arising from deep models: Advances in NLP often arise from a better understanding of how statistical models operate. In syntactic parsing, for example, simplification and analysis of lexicalized models led to more robust and effective kinds of grammar annotation. For deep models like sequence-to-sequence LSTMs, we have little insight into the structure of their vector spaces beyond some very basic geometry: in a recurrent POS tagger, for instance, hidden states at sentence positions with similar parts of speech might be expected to lie close together. Better intrinsic analysis of which phenomena these models capture would guide efforts to develop better models down the road.
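The kind of intrinsic geometric analysis described above can be sketched as follows: given hidden states labeled with parts of speech, compare the average cosine similarity of same-label pairs against different-label pairs. The states below are synthetic stand-ins (two noisy clusters); in practice they would come from a trained recurrent tagger.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def within_vs_across(states, labels):
    """Mean cosine similarity for same-label vs different-label pairs."""
    within, across = [], []
    n = len(states)
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(states[i], states[j])
            (within if labels[i] == labels[j] else across).append(sim)
    return np.mean(within), np.mean(across)

rng = np.random.default_rng(0)
# Synthetic "hidden states": one noisy cluster per POS tag, standing in
# for the states a trained tagger would produce.
nouns = rng.normal(loc=[1.0, 0.0], scale=0.1, size=(5, 2))
verbs = rng.normal(loc=[0.0, 1.0], scale=0.1, size=(5, 2))
states = np.vstack([nouns, verbs])
labels = ["NOUN"] * 5 + ["VERB"] * 5
within_sim, across_sim = within_vs_across(states, labels)
```

If the hypothesis about the tagger's geometry holds, `within_sim` should substantially exceed `across_sim`; running the same diagnostic on real hidden states is one small step toward the intrinsic analysis argued for here.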