II. Surrounding cues
Rapid Object Detection Using a Boosted Cascade of Simple Features, by P. Viola and M. Jones. CVPR 2001.
Histograms of Oriented Gradients for Human Detection, by N. Dalal and B. Triggs. CVPR 2005.
Additional code / software:
Object Recognition from Local Scale-Invariant Features, by D. Lowe. ICCV 1999.
Local Invariant Feature Detectors: A Survey, by T. Tuytelaars and K. Mikolajczyk. Foundations and Trends in Computer Graphics and Vision, 2008.
Sampling Strategies for Bag-of-Features Image Classification, by E. Nowak, F. Jurie, and B. Triggs. ECCV 2006.
Groups of Adjacent Contour Segments for Object Detection, by V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid. PAMI 2007.
Normalized Cuts and Image Segmentation, by J. Shi and J. Malik. CVPR 1997.
Shape Matching and Object Recognition Using Shape Contexts, by S. Belongie, J. Malik, and J. Puzicha. PAMI April 2002.
Ivan Laptev’s software for space-time interest points, histograms of oriented gradients (HOG), and histograms of optical flow (HOF)
Berkeley Group boundary detection code from David Martin
Graph-based segmentation code from Pedro Felzenszwalb
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, by K. Grauman and T. Darrell. ICCV 2005.
Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification, by A. Frome, Y. Singer, F. Sha, and J. Malik. ICCV 2007.
Video Google: A Text Retrieval Approach to Object Matching in Videos, by J. Sivic and A. Zisserman, ICCV 2003.
Proximity Distribution Kernels for Geometric Context in Category Recognition, by H. Ling and S. Soatto. CVPR 2007.
Object Class Recognition by Unsupervised Scale Invariant Learning, by R. Fergus, P. Perona, and A. Zisserman. CVPR 2003.
Combined Object Categorization and Segmentation with an Implicit Shape Model, by B. Leibe, A. Leonardis, and B. Schiele. ECCV Workshop on Statistical Learning in Computer Vision, 2004.
A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb, D. McAllester and D. Ramanan. CVPR 2008.
LabelMe: a Database and Web-based Tool for Image Annotation, by B. Russell, A. Torralba, K. Murphy, and W. Freeman, IJCV 2008.
Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006.
GrabCut - Interactive Foreground Extraction using Iterated Graph Cuts, by C. Rother, V. Kolmogorov, and A. Blake, SIGGRAPH 2004.
Multi-Level Active Prediction of Useful Image Annotations for Recognition, by S. Vijayanarasimhan and K. Grauman, NIPS 2008.
Geometric Context from a Single Image, by D. Hoiem, A. Efros, and M. Hebert, ICCV 2005.
Depth Estimation using Monocular and Stereo Cues, by A. Saxena, J. Schulte, and A. Ng. IJCAI 2007.
Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, by A. Oliva and A. Torralba, IJCV 2001.
A Bayesian Hierarchical Model for Learning Natural Scene Categories, by L. Fei-Fei and P. Perona. CVPR 2005.
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, by S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006.
Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008.
Contextual Priming for Object Detection, by A. Torralba. IJCV, 2003.
Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.
Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006.
IM2GPS: Estimating Geographic Information from a Single Image, by J. Hays and A. Efros. CVPR 2008.
80 Million Tiny Images: a Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman, PAMI 2008.
Scene Segmentation Using the Wisdom of Crowds, by I. Simon and S. Seitz. ECCV 2008.
Harvesting Image Databases from the Web, by F. Schroff, A. Criminisi, and A. Zisserman, ICCV 2007.
World-scale Mining of Objects and Events from Community Photo Collections, by T. Quack, B. Leibe, and L. Van Gool, CIVR 2008.
“Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006.
Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.
Movie/Script: Alignment and Parsing of Video and Text Transcription, by T. Cour, C. Jordan, E. Miltsakaki, and B. Taskar, ECCV 2008.
Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, by A. Gupta and L. Davis, ECCV 2008.
Discovering Objects and Their Location in Images, by J. Sivic, B. Russell, A. Efros, A. Zisserman, and W. Freeman, ICCV 2005.
Unsupervised Discovery of Action Classes, by Y. Wang, H. Jiang, M. Drew, Z-N. Li and G. Mori, CVPR 2006.
Detecting Irregularities in Images and in Video, by O. Boiman and M. Irani, ICCV 2005.
Scalable Recognition with a Vocabulary Tree, by D. Nister and H. Stewenius, CVPR 2006.
Fast Image Search for Learned Metrics, by P. Jain, B. Kulis, and K. Grauman, CVPR 2008.
Efficient Near-Duplicate Detection and Sub-Image Retrieval, by Y. Ke, R. Sukthankar, and L. Huston. Multimedia 2004.
Small Codes and Large Image Databases for Recognition, by A. Torralba, R. Fergus, and Y. Weiss. CVPR 2008.
Object Retrieval with Large Vocabularies and Fast Spatial Matching, by J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.
Nonchronological Video Synopsis and Indexing, by Y. Pritch, A. Rav-Acha, and S. Peleg, TPAMI 2008.
CuZero: Embracing the Frontier of Interactive Visual Search for Informed Users, by E. Zavesky and S-F. Chang, MIR 2008.
Photo Tourism: Exploring Photo Collections in 3D, by N. Snavely, S. Seitz, and R. Szeliski, SIGGRAPH 2006.
Graph-Cut Transducers for Relevance Feedback in Content Based Image Retrieval, by H. Sahbi, J-Y. Audibert, R. Keriven, ICCV 2007.
Autotagging Facebook: Social Network Context Improves Photo Annotation, by Z. Stone, T. Zickler, and T. Darrell. Internet Vision Workshop 2007.
Learning Tag Relevance by Neighbor Voting for Social Image Retrieval, by X. Li, C. Snoek, and M. Worring. MIR 2008.
Why We Tag: Motivations for Annotation in Mobile and Online Media, by M. Ames and M. Naaman, CHI 2007.