AI Colloquia - Tamara Berg/Stony Brook University, "Learning from Descriptive Text," ACES 2.302

Contact Name: Ray Mooney and Kristen Grauman

Signup Schedule:

Type of Talk: AI Colloquia

Speaker/Affiliation: Tamara Berg/Stony Brook University

Talk Audience: Faculty, Grads, Undergrads, and Outside Interested Parties

Date/Time: 11/30/2012, 11:00 AM to 12:00 PM

Location: ACES 2.302

Hosts: Ray Mooney and Kristen Grauman

Talk Title: "Learning from Descriptive Text"

Talk Abstract: People communicate using language, whether spoken, written, or typed. A significant amount of this language describes the world around us, especially the visual world in an environment or depicted in images and video. In addition, billions of photographs with associated text are available on the web; examples include web pages, captioned or tagged photographs, and video with speech or closed captioning. Such visually descriptive language is potentially a rich source of 1) information about the world, especially the visual world, 2) training data for how people construct natural language to describe imagery, and 3) guidance for where computational visual recognition algorithms should focus their efforts. In this talk, Tamara Berg will describe several projects related to images and descriptive text, including recent approaches to automatically generating natural language descriptions of images, her group's newly released collection of 1 million captioned images, and explorations of how visual content relates to what people find important in images.

All papers, created datasets, and demos are available on the speaker's webpage.

Speaker Bio: Tamara Berg received her B.S. in Mathematics and Computer Science from the University of Wisconsin, Madison in 2001. She completed her PhD at the University of California, Berkeley in 2007 and then spent a year as a research scientist at Yahoo! Research. She is currently an Assistant Professor in the computer science department at Stony Brook University and a core member of the consortium for Digital Art, Culture, and Technology (cDACT). Her research straddles the boundary between Computer Vision and Natural Language Processing, with applications to large-scale recognition and multimedia retrieval.