UTCS Faculty Candidate - Ali Farhadi/Institute at Carnegie Mellon University, "Toward Richer Visual Recognition: Attributes, Visual Phrases, and Sentences", ACES 2.302

Contact Name: Jenna Whitney
Apr 24, 2012 11:00am - 12:00pm

There is a sign-up schedule for this event that can be found at

Type of Talk: UTCS Faculty Candidate

Speaker/Affiliation: Ali Farhadi/Institute at Carnegie Mellon University

Talk Audience: UTCS Faculty, Graduate Students, Undergraduate Students and Outside Interested Parties


Date/Time: Tuesday, April 24, 2012, 11:00 am

Location: ACES 2.302

Host: Don Fussell

Talk Title: Toward Richer Visual Recognition: Attributes, Visual Phrases, and Sentences

Talk Abstract:
What does it mean to do object recognition? My ultimate goal is to have a
machine generate a human-quality description of images. Humans can form
complete sentences describing images. These sentences identify the most
interesting objects, the actions that are being performed, and the scene
where the action occurs. Emulating this skill demands answers to
fundamental questions about recognition: How can a recognition system deal
with the vast number of objects in the real world? What should a
recognition system report when it sees an unfamiliar object? What are the
right quanta of recognition? In this talk, I will explore novel
representations that try to answer these questions. First, I will describe
the notion of "visual attributes" and show the benefits of adopting an
attribute-centric framework in cross-category generalization and in
providing richer image descriptions. I will also introduce "visual
phrases," chunks of meaning bigger than objects but smaller than scenes.
Finally, I will show that using visual phrases significantly improves the
performance of current recognition systems.

Speaker Bio:
Ali Farhadi is a Postdoctoral Fellow at the Robotics Institute at Carnegie
Mellon University working with Martial Hebert and Alexei Efros. He received
his PhD from the computer science department at the University of Illinois
at Urbana-Champaign under the supervision of David Forsyth. His work is
mainly focused on computer vision and machine learning. More specifically,
he is interested in cross-category generalization, attribute-based object
representations, deeper image understanding, transfer learning and its
applications to human activity and object recognition. Ali has been awarded
the inaugural Google Fellowship in computer vision and image
interpretation, the C.W. Gear Outstanding Graduate Award, the University of
Illinois CS Fellowship, Beckman CS/AI Award, and the CVPR11 Best Student
Paper Award.