UTCS Colloquia/AI: Kristen Grauman/UT-Austin, UTCS: "Efficient Visual Search and Learning" ACES 2.402, Friday, Feb. 27, 2009 11:00 a.m.

Contact Name: 
Jenna Whitney
Feb 27, 2009 11:00am - 12:00pm

Type of Talk:  UTCS Colloquia/AI


Speaker/Affiliation:  Kristen Grauman/University of Texas-Austin, UTCS


Date/Time:  Friday, February 27, 2009  11:00 a.m.


Location:  ACES 2.402

Host:  Ray Mooney

Talk Title:  "Efficient Visual Search and Learning"

Talk Abstract:

Image and video data are rich with meaning, memories, or entertainment,
and they can even facilitate communication or scientific discovery.
However, our ability to capture and store massive amounts of interesting
visual data has outpaced our ability to analyze it. Methods to search and
organize images directly based on their visual cues are thus necessary to
make them fully accessible. Unfortunately, the complexity of the problem
often leads to approaches that will not scale: conventional methods rely
on substantial manually annotated training data, or have such high
computational costs that the representation or data sources must be
artificially restricted. In this talk I will present our work addressing
scalable image search and recognition. I will focus on our techniques for
fast image matching and retrieval, and introduce an active learning
strategy that minimizes the annotations that a human supervisor must
provide to produce accurate models.

While generic distance functions are often used to compare image
features, we can use a sparse set of similarity constraints to learn
metrics that better reflect their underlying relationships. To allow
sub-linear time similarity search under the learned metrics, we show how
to encode the metric parameterization into randomized locality-sensitive
hash functions. Our learned metrics improve accuracy relative to
commonly-used metric baselines, while our hashing construction enables
efficient indexing with learned distances and very large databases. In
order to best leverage manual intervention, we show how the system itself
can actively choose its desired annotations. Unlike previous work, our
approach accounts for the fact that the optimal use of manual annotation
may call for a combination of labels at multiple levels of granularity
(e.g., a full segmentation on some images and a present/absent flag on
others). I will provide results illustrating how these efficient
strategies will enable a new class of applications that rely on the
analysis of large-scale visual data, such as object recognition, activity
discovery, or meta-data labeling.

Speaker Bio:


Kristen Grauman is a Clare Boothe Luce Assistant Professor in the
Department of Computer Sciences at the University of Texas at Austin.
Before joining UT in 2007, she received the Ph.D. and S.M. degrees from
the MIT Computer Science and Artificial Intelligence Laboratory. Her
research in computer vision and machine learning focuses on visual search
and recognition. She is a Microsoft Research New Faculty Fellow, and a
recipient of an NSF CAREER award and the Frederick A. Howes Scholar Award
in Computational Science.