CS395T: Visual Recognition and Search

Spring 2009

 

Thursdays 3:30-6:30 pm

TAY 3.144

Unique # 54425

 

Instructor: Kristen Grauman

TA: Harshdeep Singh

 

Office hours: by appointment, CSA 114 (modular building near ENS)

 

 

 



 

 

 


Announcements:

 

Reading for next week:

 

Autotagging Facebook: Social Network Context Improves Photo Annotation, by Z. Stone, T. Zickler, and T. Darrell.  Internet Vision Workshop 2007.

[pdf]

 

Learning Tag Relevance by Neighbor Voting for Social Image Retrieval, by X. Li, C. Snoek, and M. Worring.  MIR 2008. 

[pdf]

 

Why We Tag: Motivations for Annotation in Mobile and Online Media, by M. Ames and M. Naaman, CHI 2007. 

[pdf]

 

 

Overview:

 

This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object recognition and content-based retrieval for images and videos.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research. 

 

 

Topics will include:

 

        recognition models for objects

        image/video search and the web

        fast indexing methods

        the image annotation process

        holistic scene recognition

        considering language (text) with visual cues

        the role of context in recognition

        unsupervised and semi-supervised learning from images

 

See the syllabus and list of selected papers for more details.

 

Students will be responsible for writing paper reviews each week, participating in discussions, presenting a paper and demo, and completing a project (done in pairs).  More details are below.

 

 

Prerequisites:

 

Courses in computer vision and/or machine learning.

 

Ability to understand conference papers in this area and analyze them at a high level.

 

Please talk to me if you are uncertain whether this course will be a good match for your background.

 

 

Course requirements:

 

Students are expected to do the assigned reading, participate in class discussions, write two paper reviews each week, and complete a final project.  In addition, everyone will be responsible for giving two presentations: one that involves doing background research on a topic (using 2-3 papers from the provided list), and one that involves an experimental demo relevant to one of the topics.  The two presentations should be on different topics.  Details on each of these elements are provided here.

 

 

Grading policy:

 

Grades in the class will be determined as follows:

 

        20% Participation (including attendance, in-class discussions, paper reviews)

        20% Paper presentation

        20% Demo presentation

        40% Final project (including proposal, progress report, and final paper)

 

Please read the UTCS code of conduct.

 

 

Important dates:

 

March 19: Spring break, no class

March 26: Project proposals due

April 23 (changed from April 16): Project progress reports / drafts due

May 7: Final project papers due

May 7 and 8: Final project presentations, 3:30-6:30 pm [Note unusual date, Friday the 8th]

 

 

Books:

 

There is no required textbook for this course, as we will get most of our content from the papers we read.  However, you may find these books useful references.  They are on reserve at the PCL library.

 

        Computer vision : a modern approach, David A. Forsyth and Jean Ponce.

 

        Computer vision, Linda G. Shapiro and George C. Stockman.

 

        Introductory techniques for 3-D computer vision, Emanuele Trucco and Alessandro Verri.

 

        Computer vision, Dana H. Ballard and Christopher M. Brown.  (available online)

 

        Multiple view geometry in computer vision, Richard Hartley and Andrew Zisserman.

 

        Pattern classification, Richard O. Duda, Peter E. Hart, and David G. Stork.

 

        Machine learning, Tom M. Mitchell.

 

 

 

Links:

 

        Compiled list of recognition datasets

 

        OpenCV (open source computer vision library)

 

        Weka (Java data mining software)

 

        Netlab (Matlab toolbox for data analysis techniques, written by Ian Nabney and Christopher Bishop)

 

        CV Online

 

        Annotated Computer Vision Bibliography

 

        Computer vision conferences

 

        ICCV 2005 / CVPR 2007 Short Course on Recognition

 

        AAAI 2008 Tutorial on Recognition

 

        ICML 2008 Tutorial on Recognition

 

 

Related courses:

 

Past semesters at UT:

        CS 395T Spring 2007: Object Recognition

        CS 395T Spring 2008: Visual Recognition and Search

 

Elsewhere:

        6.870 Object Recognition and Scene Understanding, MIT, Antonio Torralba

        16-721 Learning-based Methods in Vision, CMU, Alyosha Efros

        252C Selected Topics in Vision & Learning, UCSD, Serge Belongie

        CMPT882: Recognition Problems in Computer Vision, SFU, Greg Mori

        CS 598: High-Level Recognition in Computer Vision, Princeton, Fei-Fei Li