Project extended outlines due Wed Oct
31. See handout
Topics: This is a
graduate seminar course in computer vision. We
will survey and discuss current vision papers relating to
object and activity recognition, auto-annotation of images,
and scene understanding. The goals of the course will
be to understand current approaches to some important
problems, to actively analyze their strengths and
weaknesses, and to identify interesting open questions and
possible directions for future research.
See the syllabus for an outline of
the main topics we'll be covering.
Students will be responsible for writing paper reviews each week,
participating in discussions, completing two programming
assignments, presenting once or twice in class (depending on
enrollment), and completing a project (done in pairs).
Note that presentations
are due one week before
the slot in which your presentation is scheduled. This means
you will need to read the papers, prepare experiments,
create slides, etc. more than one week before the date you
are signed up for. The idea is to meet and discuss
ahead of time, so that we can iterate as needed in the week
leading up to your presentation.
More details on
the requirements and grading breakdown are here.
Prerequisites: Courses in computer vision and/or machine learning (378/376 Computer
Vision and/or 391 Machine Learning, or similar); ability to
understand and analyze conference papers in this area;
programming is required for experiment presentations.
Talk to me if you are unsure whether the course is a good match
for your background. I generally recommend scanning
through a few papers on the syllabus to gauge what kind of
background is expected. I don't assume you are already
familiar with every single algorithm/tool/image feature a
given paper mentions, but you should feel comfortable
following the key ideas.
Note: * = required reading.
Additional papers are provided for reference and as a starting
point for background reading for projects.
Paper presentations: Cover the starred papers.
Experiment presentations: Pick one from among the starred papers.
*Selected pages from: Local Invariant
Feature Detectors: A Survey, Tuytelaars and
Mikolajczyk. Foundations and Trends in
Computer Graphics and Vision, 2008. [pdf]
[pp. 178-188, 216-220, 254-255]
*Video Google: A Text Retrieval
Approach to Object Matching in Videos, Sivic and
Zisserman, ICCV 2003. [pdf]
retrieval algorithms, mining for visual themes,
particularly for object instances
*Total Recall: Automatic Query
Expansion with a Generative Feature Model for Object
Retrieval. O. Chum et al. CVPR 2007. [pdf]
*Discovering Favorite Views of Popular Places
with Iconoid Shift. T. Weyand and B.
Leibe. ICCV 2011. [pdf]
*Supervised Hashing with Kernels. W.
Liu, J. Wang, R. Ji, Y. Jiang, S.-F. Chang.
CVPR 2012 [pdf]
Kernelized Locality Sensitive Hashing
for Scalable Image Search, by B. Kulis and K.
Grauman, ICCV 2009 [pdf]
[Tiny Images data]
Computing and Exploiting Connectivity in Image
Collections. K. Heath, N. Gelfand, M.
Ovsjanikov, M. Aanjaneya, and L. Guibas. CVPR
World-scale Mining of
Objects and Events from Community Photo
Collections. T. Quack, B. Leibe, and L. Van
Gool. CIVR 2008. [pdf]
Total Recall II: Query
Expansion Revisited. O. Chum, A. Mikulik, M.
Perdoch, and J. Matas. CVPR 2011. [pdf]
Finding a (Thick) Needle in a Haystack, O. Chum, M.
Perdoch, and J. Matas. CVPR 2009. [pdf]
Three Things Everyone
Should Know to Improve Object Retrieval. R.
Arandjelovic and A. Zisserman. CVPR
Mining with Frequent Itemset Configurations.
T. Quack, V. Ferrari, and L. Van Gool. CIVR
Bundling Features for
Large Scale Partial-Duplicate Web Image
Search. Z. Wu, Q. Ke, M. Isard, and J.
Sun. CVPR 2009. [pdf]
Localization by Active Correspondence Search. T.
Sattler, B. Leibe, L. Kobbelt. ECCV
Learning Binary Projections
for Large-Scale Image Search. K.
Grauman and R. Fergus. Chapter
to appear in Registration, Recognition, and Video
Analysis, R. Cipolla, S. Battiato, and G.
Farinella, Editors. [pdf]
Prefilters for Scalable Image Retrieval. L.
Torresani, M. Szummer, and A. Fitzgibbon.
CVPR 2009. [pdf]
Detecting Objects in
Large Image Collections and Videos by Efficient
Subimage Retrieval, C. Lampert, ICCV 2009. [pdf]
Searching for Similar Images. K.
the ACM, 2009. [CACM
Fast Image Search for
Learned Metrics, P. Jain, B. Kulis, and K. Grauman,
CVPR 2008. [pdf]
Small Codes and Large
Image Databases for Recognition, A. Torralba, R.
Fergus, and Y. Weiss, CVPR 2008. [pdf]
Object Retrieval with Large Vocabularies and
Fast Spatial Matching. J. Philbin, O. Chum, M.
Isard, J. Sivic, and A. Zisserman, CVPR 2007.
Location Recognition, G. Schindler, M. Brown, and R.
Szeliski, CVPR 2007. [pdf]
Sharing features between classes, transfer, taxonomy,
learning from few examples, exploiting class relationships
*Sharing Visual Features for
Multiclass and Multiview Object Detection, A.
Torralba, K. Murphy, W. Freeman, PAMI 2007. [pdf]
*Hedging Your Bets: Optimizing
Accuracy-Specificity Trade-offs in Large Scale Visual
Recognition. J. Deng, J. Krause, A. Berg, L.
Fei-Fei. CVPR 2012 [pdf]
Model Transfer for Object Category Detection. Y.
Aytar and A. Zisserman. ICCV 2011. [pdf]
What Does Classifying More than 10,000 Image
Categories Tell Us? J.
Deng, A. Berg, K. Li and L. Fei-Fei. ECCV
Relaxed Hierarchy for Large-scale Visual
Recognition. T. Gao and D. Koller.
ICCV 2011. [pdf]
Comparative Object Similarity for Improved
Recognition with Few or Zero Examples. G.
Wang, D. Forsyth, and D. Hoiem. CVPR
Using Taxonomies for Fast Visual Categorization, G.
Griffin and P. Perona, CVPR 2008. [pdf]
80 Million Tiny Images: A Large
Dataset for Non-Parametric Object and Scene
Recognition, by A. Torralba, R. Fergus, and W.
Freeman. PAMI 2008. [pdf]
Constructing Category Hierarchies for Visual
Recognition, M. Marszalek and C. Schmid. ECCV
Visual Models from Few Training Examples: an
Incremental Bayesian Approach Tested on 101 Object
Categories. L. Fei-Fei, R. Fergus, and P.
Perona. CVPR Workshop on Generative-Model
Based Vision. 2004. [pdf]
Towards Scalable Representations of
Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis. CVPR 2007 [pdf]
A. Zweig and D. Weinshall, ICCV
Learning of Object Detectors Using a Visual Shape
Alphabet. Opelt, Pinz, and Zisserman, CVPR
Image Database, J. Deng, W. Dong, R. Socher,
L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf]
Sharing for Learning with Many Categories. R.
Fergus et al. ECCV 2010. [pdf]
Learning a Tree of Metrics with Disjoint
Visual Features. S. J. Hwang, K. Grauman, F.
Sha. NIPS 2011.