CS 395T: Object Recognition

Spring 2007


Announcements | Overview | Course requirements | Schedule and papers | Books | Useful links



Tues/Thurs 12:30 – 2:00 pm in ACES 3.408 (note new location)

Unique #55226


Instructor: Kristen Grauman 

Email: grauman -at- cs.utexas.edu

Office hours: Tues/Thurs 2:00 – 3:00 pm in TAY 4.118



·        Project papers are due Friday May 4. Guidelines for writing them are here.

·        Five-minute project presentations are on May 1 and May 3.  The schedule is here.

·        Proposal guidelines and project ideas are here (pdf).

·        Reading assignments are as listed on the schedule.




The goal of computer vision is to develop the algorithms and representations that will enable a machine to autonomously analyze visual information.  As such, object recognition is a fundamental vision problem: put simply, what’s in the picture, and where?  Recognition remains challenging in large part due to the significant variations exhibited by real-world images.  Partial occlusions, viewpoint changes, varying illumination, cluttered backgrounds, and intra-category appearance variations all make it necessary to develop exceedingly robust models of categories. 


In this course we will survey and discuss current computer vision literature on object and category recognition.  The goals of the course will be to understand current approaches to some important problems in visual recognition, to actively analyze their strengths and weaknesses, and to begin to identify interesting open questions and possible directions for future research.  Topics will include part-based models for recognition, invariant local features, bags of features, local spatial constraints, shape descriptors and matching, learning similarity measures, fast indexing methods, recognition with text and images, the role of context in recognition, and unsupervised category discovery. 


For each given sub-topic, our discussion and class presentations will center around a few selected relevant research papers.  Our study of state-of-the-art topics in recognition will lead up to research-oriented final course projects.





There are no rigid prerequisites for this course, aside from an interest in computer vision.  Any previous exposure to computer vision, machine learning, applied probability, and/or image processing will be an asset.  Please feel free to contact me if you have any concerns about whether you should take this course.



Course expectations:


Discussions, paper reviews, and presentations


The quality of our discussions will depend heavily on how prepared everyone is when they come to class.  Students are expected to keep up with the readings so that they can actively participate in our discussions.  To assist in this preparation, before coming to class, students will be required to submit a short review of some portion of the current reading material and to prepare a few questions they would like to pose to the class about the research.  General guidelines for writing your paper reviews are here.


For each topic/session, two to three students from the class will also be responsible for 1) giving us a concise, well-prepared presentation on a selected paper and 2) preparing an in-class “demo” that is relevant to the readings.  I can provide feedback on your planned presentations if you meet with me (or email me slides) a few days before you are scheduled to present, although this is not required.  The number of days each student presents will depend on the class size.  Details will be discussed the first week of class.  General guidelines for preparing a paper presentation or demo are here.




As part of this course, students will complete research-oriented projects.  A good project could be built around any of the following:

·        an extension to one of the techniques studied in class

·        an in-depth empirical evaluation and analysis of a few related techniques

·        design of a novel approach and accompanying experiments


Project proposals will be due in the middle of the term.  I encourage you to define your own project; however, I am also happy to suggest potential project ideas.  At the end of the term we will reserve time to present and discuss each project in class.



Grades in the class will be determined roughly as follows:

·        30%: Class participation and regular paper reviews

·        30%: Class presentation(s) and demo(s)

·        40%: Final project proposal, paper, and presentation



Please read the UTCS code of conduct.





The list of topics and papers we will cover is here.  Please note that the details of our schedule are subject to change if we need more time for a given topic.




There is no required textbook for this course, as we will get most of our content from the papers we read.

However, you may find these books to be useful references:

·        Forsyth and Ponce.  Computer Vision: A Modern Approach.

·        Duda, Hart, and Stork.  Pattern Classification (2nd Edition).

Copies are on reserve for our class at the library (PCL).


Useful links:


·        CV Online

·        OpenCV (open source computer vision library)

·        Weka (Java data mining software)

·        Object recognition databases (list compiled by Kevin Murphy)

·        Various useful databases and image sources (list compiled by Alyosha Efros)

·        Netlab (MATLAB toolbox for data analysis techniques, written by Ian Nabney and Christopher Bishop)

·        Oxford Visual Geometry Group (contains links to data sets and feature extraction software)

·        Computer vision conferences

·        Annotated computer vision bibliography

·        Face recognition homepage

·        Computer vision research groups