Our research is broadly concerned with how the human brain processes and represents the natural world. In particular, we want to understand how language is processed and represented by the cortex, and how those representations are grounded in other modalities.

The main method that we employ is the encoding model: a mathematical model that learns, from large amounts of data, to predict how the brain will respond to new stimuli (such as language). These models can then be tested by measuring how well they predict responses in a held-out dataset that was never used for training. We generally train and test our encoding models using natural stimuli such as narrative stories or books. Together, encoding models and natural stimuli give a clear gradient along which research can progress: if we can build a model that better predicts the brain, then we have gained some understanding of how the brain works.
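
To make this logic concrete, here is a minimal sketch of an encoding-model analysis in Python, assuming ridge regression as the model class and using randomly generated arrays in place of real stimulus features and fMRI responses. All array shapes, names, and parameter values are illustrative assumptions, not a description of our actual pipeline.

```python
# Minimal encoding-model sketch: fit a regularized linear model that
# maps stimulus features to voxel responses, then evaluate it on
# held-out data. All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 1000 time points of 300-dimensional stimulus
# features (e.g., word embeddings) and responses from 5000 voxels.
X = rng.standard_normal((1000, 300))   # stimulus features
Y = rng.standard_normal((1000, 5000))  # fMRI responses (voxels)

# Hold out the final portion of the data for testing.
split = 800
X_train, X_test = X[:split], X[split:]
Y_train, Y_test = Y[:split], Y[split:]

# Fit one regularized linear model per voxel (sklearn's Ridge handles
# the multi-output case internally).
model = Ridge(alpha=1.0)
model.fit(X_train, Y_train)

# Evaluate by correlating predicted and actual responses on the
# held-out data, separately for each voxel.
Y_pred = model.predict(X_test)
r = np.array([np.corrcoef(Y_pred[:, v], Y_test[:, v])[0, 1]
              for v in range(Y.shape[1])])
print(f"mean held-out correlation: {r.mean():.3f}")
```

With real data, the per-voxel correlations produced by this kind of procedure are what tell us which parts of cortex the feature space explains well.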

This work employs a wide variety of tools drawn from machine learning, natural language processing, applied mathematics, computer graphics, physics, and neuroscience. One focus of this lab is therefore to develop new tools and apply them to neuroscience problems, keeping us at the cutting edge of these technologies. Our long-term goal is to use these neuroscience data and results to build better algorithms and smarter machines.
Language Semantics
How is the meaning of language represented in the human brain? In earlier work we explored representations of meaning at the scale of single words, revealing complex maps across association cortex. We now aim to use similar technology to explore how phrases and sentences, not just single words, are represented by activity in the human cortex.
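
As one concrete possibility, the features for such an encoding model could be extracted from a pretrained language model, whose representations depend on context, rather than from static word embeddings. The sketch below assumes the HuggingFace transformers library and the gpt2 model; the mean-pooling strategy and variable names are illustrative assumptions, not a description of our published methods.

```python
# Sketch: extract a contextual feature vector for a sentence, which
# could serve as one row of an encoding model's feature matrix.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

sentence = "The narrator described the old house on the hill."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Unlike a static word embedding, each token's vector here depends on
# the surrounding words; mean-pooling over tokens yields a single
# feature vector for the whole sentence.
features = outputs.last_hidden_state.mean(dim=1).squeeze(0)
print(features.shape)  # the model's hidden size, e.g. 768 for gpt2
```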

Although our previous work used one of the largest fMRI language datasets ever collected on single subjects, even that amount of data is much too small to test models of language understanding that incorporate the relationships between words, which give rise to context and compositionality. Thus, one of our major aims is to collect unprecedentedly large datasets from single subjects. This will enable us to test large-scale nonlinear models that have never before been used to predict human fMRI data.
Grounded Language Representations
Language serves as a gateway to cognition. Words can elicit remarkably complex cognitive processing, giving experimental access to many different aspects of cognition. Thus, by studying how the brain represents the meaning of language, we can simultaneously explore how and where in the cortex many different cognitive processes operate. We are particularly interested in the interplay between language and other modalities, such as spatial reasoning, visual processing, and somatosensory processing.