CS395T: Special Topics in Computer Vision, Spring 2010

Object Recognition



Course overview        Useful links        Syllabus        Detailed schedule        eGradebook        Blackboard


Meets:
Wednesdays 3:30-6:30 pm
ACES 3.408
Unique # 54470
 
Instructor: Kristen Grauman 
Email: grauman@cs
Office: CSA 114
 
TA: Sudheendra Vijayanarasimhan
Email: svnaras@cs
Office: CSA 106

When emailing us, please put CS395 in the subject line.

Announcements:

See the schedule for current reading assignments. 

Project paper drafts due Friday April 30.

Course overview:


Topics: This is a graduate seminar course in computer vision.   We will survey and discuss current vision papers relating to object recognition, auto-annotation of images, and scene understanding.  The goals of the course will be to understand current approaches to some important problems, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research.

See the syllabus for an outline of the main topics we'll be covering.

Requirements: Students will be responsible for writing paper reviews each week, participating in discussions, completing one programming assignment, presenting once or twice in class (depending on enrollment, and possibly done in teams), and completing a project (done in pairs). 

Note that presentations are due one week before the slot your presentation is scheduled.  This means you will need to read the papers, prepare experiments, make plans with your partner, create slides, etc. more than one week before the date you are signed up for.  The idea is to meet and discuss ahead of time, so that we can iterate as needed the week leading up to your presentation. 

More details on the requirements and grading breakdown are here.  Information on the projects and project proposals is here.

Prereqs:  Courses in computer vision and/or machine learning (378 Computer Vision and/or 391 Machine Learning, or similar); ability to understand and analyze conference papers in this area; programming required for experiment presentations and projects. 

Please talk to me if you are unsure if the course is a good match for your background.  I generally recommend scanning through a few papers on the syllabus to gauge what kind of background is expected.  I don't assume you are already familiar with every single algorithm/tool/image feature a given paper mentions, but you should feel comfortable following the key ideas.


Syllabus overview:

  1. Single-object recognition fundamentals: representation, matching, and classification
    1. Specific objects
    2. Classification and global models
    3. Objects composed of parts
    4. Region-based methods
  2. Beyond single objects: recognizing categories in context and learning their properties
    1. Context
    2. Attributes
    3. Actions and objects/scenes
  3. Scalability issues in category learning, detection, and search
    1. Too many pixels!
    2. Too many categories!
    3. Too many images!
  4. Recognition and "everyday" visual data
    1. Landmarks, locations, and tourists
    2. Alignment with text
    3. Pictures of people


Schedule and papers:


Note:  * = required reading. 
Additional papers are provided for reference, and as a starting point for background reading for projects.
Paper presentations: focus on starred papers (additionally mentioning ideas from others is ok but not necessary).
Experiment presentations: Pick from only among the starred papers.
Date
Topics
Papers and links
Presenters
Items due
Jan 20
Course intro  handout


Topic preferences due via email by Monday Jan 25
I. Single-object recognition fundamentals: representation, matching, and classification
Jan 27
Recognizing specific objects:

Invariant local features, instance recognition, bag-of-words models

sift
  • *Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999.  [pdf]  [code] [other implementations of SIFT] [IJCV]
  • *Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk.  Foundations and Trends in Computer Graphics and Vision, 2008. [pdf]  [Oxford code] [Read pp. 178-188, 216-220, 254-255]
  • *Video Google: A Text Retrieval Approach to Object Matching in Videos, Sivic and Zisserman, ICCV 2003.  [pdf]  [demo]
  • Scalable Recognition with a Vocabulary Tree, D. Nister and H. Stewenius, CVPR 2006. [pdf]
  • SURF: Speeded Up Robust Features, Bay, Ess, Tuytelaars, and Van Gool, CVIU 2008.  [pdf] [code]
  • Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, J. Matas, O. Chum, U. Martin, and T. Pajdla, BMVC 2002.  [pdf]
  • A Performance Evaluation of Local Descriptors. K. Mikolajczyk and C. Schmid.  CVPR 2003 [pdf]
  • Oxford group interest point software
  • Andrea Vedaldi's code, including SIFT, MSER, hierarchical k-means.
  • INRIA LEAR team's software, including interest points, shape features
  • Semantic Robot Vision Challenge links
lecture slides [ppt] [pdf]

Feb 3
Recognition via classification and global models:

Global appearance models for category and scene recognition, sliding window detection, detection as a binary decision.

hog
  • *Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005.  [pdf]  [video] [code] [PASCAL datasets]
  • *Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR 2006. [pdf]  [15 scenes dataset]  [libpmk] [Matlab]
  • *Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001.  [pdf]  [code]
  • Modeling the Shape of the Scene: a Holistic Representation of the Spatial Envelope, Oliva and Torralba, IJCV 2001.  [pdf]  [Gist code
  • Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004.  [pdf]
  • Pedestrian Detection in Crowded Scenes, Leibe, Seemann, and Schiele, CVPR 2005.  [pdf]
  • Pyramids of Histograms of Oriented Gradients (pHOG), Bosch and Zisserman. [code]
  • Eigenfaces for Recognition, Turk and Pentland, 1991.  [pdf]
  • Sampling Strategies for Bag-of-Features Image Classification.  E. Nowak, F. Jurie, and B. Triggs.  ECCV 2006. [pdf]
  • A Trainable System for Object Detection, C. Papageorgiou and T. Poggio, IJCV 2000.  [pdf]
  • Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. CVPR 2005.  [pdf]
  • LIBPMK feature extraction code, includes dense sampling
  • LIBSVM library for support vector machines
lecture slides [ppt] [pdf]

Feb 10
Class begins at 5 pm today.
Objects composed of parts:

Part-based models for category
recognition, and local feature matching for correspondence-based recognition

parts
  • *A Discriminatively Trained, Multiscale, Deformable Part Model, by P. Felzenszwalb,  D.  McAllester and D. Ramanan.   CVPR 2008.  [pdf]  [code
  • *Combined Object Categorization and Segmentation with an Implicit Shape Model, by B. Leibe, A. Leonardis, and B. Schiele.   ECCV Workshop on Statistical Learning in Computer Vision, 2004.   [pdf]  [code]  [IJCV extended version]
  • *Learning a Dense Multi-View Representation for Detection, Viewpoint Classification and Synthesis of Object Categories, H. Su, M. Sun, L. Fei-Fei, S. Savarese.  ICCV 2009.  [pdf]
  • Shape Matching and Object Recognition with Low Distortion Correspondences, A. Berg, T. Berg, and J. Malik, CVPR 2005.  [pdf]  [web]
  • Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification, Frome, Singer, Sha, Malik.  ICCV 2007.  [pdf]
  • Matching Local Self-Similarities Across Images and Videos, Shechtman and Irani, CVPR 2007.  [pdf]
  • The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, Grauman and Darrell.  ICCV 2005.  [pdf]  [web]  [code]
  • Shape Matching  and Object Recognition Using Shape Contexts.  S. Belongie, J. Malik, J. Puzicha.  PAMI 2002.  [pdf]
  • Multiple Component Learning for Object Detection, Dollar, Babenko, Belongie, Perona, and Tu, ECCV 2008.  [pdf]
  • Object Class Recognition by Unsupervised Scale Invariant Learning, by R. Fergus, P. Perona, and A. Zisserman.  CVPR 2003.  [pdf]  [datasets]
  • Efficient Matching of Pictorial Structures. P. Felzenszwalb and D. Huttenlocher. CVPR 2000.  [pdf] [related code]
  • A Boundary-Fragment-Model for Object Detection, Opelt, Pinz, and Zisserman, ECCV 2006.  [pdf]

Implementation assignment due Friday Feb 12, 5 PM
Feb 17
Region-based models:

Regions as parts, multi-label segmentation, integrated classification and segmentation

regions
  • *Recognition Using Regions.  C. Gu, J. Lim, P. Arbelaez, J. Malik, CVPR 2009.  [pdf]  [slides] [seg code]
  • *Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman.  CVPR 2006.  [pdf] [code]
  • *Combining Top-down and Bottom-up Segmentation. E. Borenstein, E. Sharon, and S. Ullman.  CVPR  workshop 2004.  [pdf]  [data]
  • Extracting Subimages of an Unknown Category from a Set of Images, S. Todorovic and N. Ahuja, CVPR 2006.  [pdf]
  • Class-Specific, Top-Down Segmentation, E. Borenstein and S. Ullman, ECCV 2002.  [pdf]
  • Object Recognition by Integrating Multiple Image Segmentations, C. Pantofaru, C. Schmid, and M. Hebert, ECCV 2008  [pdf]
  • Image Parsing: Unifying Segmentation, Detection, and Recognition. Tu, Z., Chen, Z., Yuille, A.L., Zhu, S.C. ICCV 2003  [pdf]
  • Robust Higher Order Potentials for Enforcing Label Consistency, P. Kohli, L. Ladicky, and P. Torr. CVPR 2008.  
  • Co-segmentation of Image Pairs by Histogram Matching --Incorporating a Global Constraint into MRFs, C. Rother, V. Kolmogorov, T. Minka, and A. Blake.  CVPR 2006.  [pdf]
  • An Efficient Algorithm for Co-segmentation, D. Hochbaum, V. Singh, ICCV 2009.  [pdf]
  • Normalized Cuts and Image Segmentation, J. Shi and J. Malik.  PAMI 2000.  [pdf]  [code]
  • Greg Mori's superpixel code
  • Berkeley Segmentation Dataset and code
  • Pedro Felzenszwalb's graph-based segmentation code
  • Michael Maire's segmentation code and paper
  • Mean-shift: a Robust Approach Towards Feature Space Analysis [pdf]  [code, Matlab interface by Shai Bagon]
  • David Blei's Topic modeling code
papers: John [pdf]
demo: Sudheendra [ppt]

II. Beyond single objects: recognizing categories in context and learning their properties
Feb 24
Context:

Inter-object relationships, objects within scenes, geometric context, understanding scene layout

context
  • *Discriminative Models for Multi-Class Object Layout, C. Desai, D. Ramanan, C. Fowlkes. ICCV 2009.  [pdf]  [slides]  [SVM struct code] [data]
  • *TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation.  J. Shotton, J. Winn, C. Rother, A. Criminisi.  ECCV 2006.  [pdf] [web] [data]
  • *Geometric Context from a Single Image, by D. Hoiem, A. Efros, and M. Hebert, ICCV 2005. [pdf]  [web]  [code]
  • *Contextual Priming for Object Detection, A. Torralba.  IJCV 2003.  [pdf] [web] [code]
  • Putting Objects in Perspective, by D. Hoiem, A. Efros, and M. Hebert, CVPR 2006.  [pdf] [web]
  • Decomposing a Scene into Geometric and Semantically Consistent Regions, S. Gould, R. Fulton, and D. Koller, ICCV 2009.  [pdf]  [slides]
  • Learning Spatial Context: Using Stuff to Find Things, by G. Heitz and D. Koller, ECCV 2008.  [pdf] [code]
  • An Empirical Study of Context in Object Detection, S. Divvala, D. Hoiem, J. Hays, A. Efros, M. Hebert, CVPR 2009.  [pdf]  [web]
  • Object Categorization using Co-Occurrence, Location and Appearance, by C. Galleguillos, A. Rabinovich and S. Belongie, CVPR 2008.[ pdf]
  • Context Based Object Categorization: A Critical SurveyC. Galleguillos and S. Belongie.  [pdf]
  • What, Where and Who? Classifying Events by Scene and Object Recognition, L.-J. Li and L. Fei-Fei, ICCV 2007. [pdf]
  • Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Unsupervised Framework, L-J. Li, R. Socher, L. Fei-Fei, CVPR 2009.  [pdf]
Piyush [ppt]
Robert [pdf]

Mar 3
Attributes:

Visual properties, learning from natural language descriptions, intermediate representations

attributes
  • *Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, C. Lampert, H. Nickisch, and S. Harmeling, CVPR 2009  [pdf] [web] [data]
  • *Describing Objects by Their Attributes, A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, CVPR 2009.  [pdf]  [web] [data]
  • *Attribute and Simile Classifiers for Face Verification, N. Kumar, A. Berg, P. Belhumeur, S. Nayar.  ICCV 2009.  [pdf] [web] [data]
  • Learning Visual Attributes, V. Ferrari and A. Zisserman, NIPS 2007.  [pdf]
  • Learning Color Names for Real-World Applications, J. van de Weijer, C. Schmid, J. Verbeek, and D. Larlus.  IEEE TIP 2009.  [pdf]  [web]
  • Learning Models for Object Recognition from Natural Language Descriptions, J. Wang, K. Markert, and M. Everingham, BMVC 2009.[pdf]
Brian [ppt]
Adam [pdf]

Friday
Mar 5
Prof. David Forsyth, UIUC
Forum for AI Talk
11 AM in ACES 2.302



Monday
Mar 8




Project proposal abstract due
Mar 10 Actions and objects/scenes:

Recognizing human actions and objects simultaneously, objects and scenes as context for the activity

actions
  • *Actions in Context, M. Marszalek, I. Laptev, C. Schmid.  CVPR 2009.  [pdf] [web]
  • *Objects in Action: An Approach for Combining Action Understanding and Object Perception.   A. Gupta and L. Davis.  CVPR, 2007.  [pdf]  [data]
  • Exploiting Human Actions and Object Context for Recognition Tasks.  D. Moore, I. Essa, and M. Hayes.  ICCV 1999.  [pdf]
  • A Scalable Approach to Activity Recognition Based on Object Use. J. Wu, A. Osuntogun, T. Choudhury, M. Philipose, and J. Rehg.  ICCV 2007.  [pdf]
  • Towards Using Multiple Cues for Robust Object Recognition, S. Aboutalib and M. Veloso, AAMAS 2007.  [pdf]
Aibo [ppt]

Mar 17
Spring break (no class)



III. Scalability issues in category learning, detection, and search
Mar 24
Too many pixels!

Bottom-up and top-down saliency measures to prioritize features, object importance, saliency in visual search tasks

saliency
  • *A Model of Saliency-based Visual Attention for Rapid Scene Analysis.  L. Itti, C. Koch, and E. Niebur.  PAMI 1998  [pdf]
  • *Some Objects are More Equal Than Others: Measuring and Predicting Importance, M. Spain and P. Perona.  ECCV 2008.  [pdf]
  • *Optimal Scanning for Faster Object Detection,  N. Butko, J. Movellan.  CVPR 2009.  [pdf]
  • Reading Between the Lines: Object Localization Using Implicit Cues from Image Tags.  S. J. Hwang and K. Grauman.  CVPR 2010.  [pdf]
  • Beyond Sliding Windows: Object Localization by Efficient Subwindow Search, C. Lampert, M. Blaschko, T. Hofmann.  CVPR 2008.  [pdf]
  • Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video.  S. Gould, J. Arfvidsson, A. Kaehler, B. Sapp, M. Messner, G. Bradski, P. Baumstrack,S. Chung, A. Ng.  IJCAI 2007.  [pdf]
  • Peekaboom: A Game for Locating Objects in Images, by L. von Ahn, R. Liu and M. Blum, CHI 2006. [pdf]  [web]
  • Determining Patch Saliency Using Low-Level Context, D. Parikh, L. Zitnick, and T. Chen. ECCV 2008.  [pdf]
  • Learning to Predict Where Humans Look, T. Judd, K. Ehinger, F. Durand, A. Torralba.  ICCV 2009.  [pdf] [web]
  • Visual Recognition and Detection Under Bounded Computational Resources, S. Vijayanarasimhan and A. Kapoor.  CVPR 2010.
  • Torralba Global Features and Attention
  • The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search, G. Zelinsky, W. Zhang, B. Yu, X. Chen, D. Samaras, NIPS 2005.  [pdf]

  • Amazon Mechanical Turk
  • Using Mechanical Turk with LabelMe
Anush [pdf]
Project update and extended outline due Friday Mar 26
Mar 31
Too many categories!

Scalable recognition with many object categories


shared
  • *Sharing Visual Features for Multiclass and Multiview Object Detection, A. Torralba, K. Murphy, W. Freeman, PAMI 2007.  [pdf]  [code]
  • *Cross-Generalization: Learning Novel Classes from a Single Example by Feature Replacement.  CVPR 2005.  [pdf]
  • *Constructing Category Hierarchies for Visual Recognition, M. Marszalek and C. Schmid.  ECCV 2008.  [pdf]  [web] [Caltech256]
  • Learning Generative Visual Models from Few Training Examples: an Incremental Bayesian Approach Tested on 101 Object Categories. L. Fei-Fei, R. Fergus, and P. Perona. CVPR Workshop on Generative-Model Based Vision. 2004.  [pdf] [Caltech101]
  • Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts. S. Fidler and A. Leonardis.  CVPR 2007  [pdf]
  • Exploiting Object Hierarchy: Combining Models from Different Category Levels, A. Zweig and D. Weinshall, ICCV 2007 [pdf]
  • Learning and Using Taxonomies for Fast Visual Categorization, G. Griffin and P. Perona, CVPR 2008.  [pdf]
  • Incremental Learning of Object Detectors Using a Visual Shape Alphabet.  Opelt, Pinz, and Zisserman, CVPR 2006.  [pdf]
  • Sequential Learning of Reusable Parts for Object Detection.  S. Krempp, D. Geman, and Y. Amit.  2002  [pdf]
  • ImageNet: A Large-Scale Hierarchical Image Database, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009 [pdf]  [data]
Rui [ppt]
Patrick [ppt]
Week of Mar 29 - Apr 2:
Individual project update meetings (by appt)
Apr 7
Too many images!

Scalable image search with large databases


hash
  • *Kernelized Locality Sensitive Hashing for Scalable Image Search, by B. Kulis and K. Grauman, ICCV 2009 [pdf]  [code]
  • *Geometric Min-Hashing: Finding a (Thick) Needle in a Haystack, O. Chum, M. Perdoch, and J. Matas.  CVPR 2009.  [pdf]
  • *Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, C. Lampert, ICCV 2009.  [pdf]  [code] [code]
  • 80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition, by A. Torralba, R. Fergus, and W. Freeman.  PAMI 2008.  [pdf] [web]
  • Fast Image Search for Learned Metrics, P. Jain, B. Kulis, and K. Grauman, CVPR 2008.  [pdf]
  • Small Codes and Large Image Databases for Recognition, A. Torralba, R. Fergus, and Y. Weiss, CVPR 2008.  [pdf]
  • Object Retrieval with Large Vocabularies and Fast Spatial Matching.  J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, CVPR 2007.  [pdf]
  • LSH homepage
  • Nearest Neighbor Methods in Learning and Vision, Shakhnarovich, Darrell, and Indyk, editors.
Muhibur [ppt]

IV: Recognition and "everyday" visual data
Apr 14
Landmarks, locations, and tourist photographers:

Location recognition, cues from tourist photos, photographer biases, retrieval for landmarks, browsing and visualization

location
  • *Landmark Classification in Large-Scale Image Collections.  Y. Li, D. Crandall, D. Huttenlocher.  ICCV 2009.  [pdf]
  • *Image Sequence Geolocation with Human Travel Priors, E. Kalogerakis, O. Vesselova, J. Hays, A. Efros, A. Hertzmann.  ICCV 2009.  [pdf]  [web]
  • *Scene Summarization for Online Image Collections.  I. Simon, N. Snavely, S. Seitz.  ICCV 2007.  [pdf]  [web]
  • Mapping the World's Photos, D. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg, WWW 2009.  [pdf]  [web]
  • Im2GPS: Estimating Geographic Information from a Single Image, J. Hays and A. Efros.  CVPR 2008.  [pdf]  [web]
  • Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, Chum, Philbin, Sivic, Isard, and Zisserman, ICCV 2007.  [pdf
  • Scene Segmentation Using the Wisdom of Crowds, by I. Simon and S. Seitz.  ECCV 2008.  [pdf]
  • Photo Tourism: Exploring Photo Collections in 3D, by N. Snavely, S. Seitz, and R. Szeliski, SIGGRAPH 2006.  [pdf] [web]
  • Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs, by X. Li, C. Wu, C. Zach, S. Lazebnik, and J. Frahm, ECCV 2008.  [pdf]  [web]
  • City-Scale Location Recognition, G. Schindler, M. Brown, and R. Szeliski, CVPR 2007.  [pdf
  • Parsing Images of Architectural Scenes, A. Berg, F. Grabler, J. Malik.  ICCV 2007.  [pdf]
  • I Know What You Did Last Summer: Object-Level Auto-annotation of Holiday Snaps, S. Gammeter, L. Bossard, T.Quack, L. van Gool, ICCV 2009.  [pdf]
  • CVPR 2009 Workshop on Visual Place Categorization
  • Code for downloading Flickr images, by James Hays
  • UW Community Photo Collections homepage
Sarah [ppt]
Suyog [pdf]

Apr 21
Alignment with text:

Discovering the correspondence between words (and other language constructs) to images or video, using captions or subtitles as weak labels.

lang
  • *"'Who are you?' - Learning Person Specific Classifiers from Video, J. Sivic, M. Everingham, and A. Zisserman, CVPR 2009.  [pdf]
  • *Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, A. Gupta and L. Davis, ECCV 2008.  [pdf]
  • *Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary, P. Duygulu, K. Barnard, N. de Freitas, D. Forsyth. ECCV 2002.  [pdf]  [data]
  • The Mathematics of Statistical Machine Translation: Parameter Estimation.  P. Brown, S. Della Pietro, V. Della Pietra, R. Mercer.  Association for Computational Linguistics, 1993.  [pdf]
  • Who's Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation.  L. Jie, B. Caputo, and V. Ferrari.  NIPS 2009.  [pdf]
  • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]
  • Learning Sign Language by Watching TV (using weakly aligned subtitles), P. Buehler, M. Everingham, and A. Zisserman. CVPR 2009.  [pdf]  [data]
  • “Hello! My name is... Buffy” – Automatic Naming of Characters in TV Video, by M. Everingham, J. Sivic and A. Zisserman, BMVC 2006.  [pdf]  [web]  [data]
  • Using Closed Captions to Train Activity Recognizers that Improve Video Retrieval, S. Gupta and R. Mooney. CVPR Visual and Contextual Learning Workshop, 2009.  [pdf]
  • Systematic Evaluation of Machine Translation Methods for Image and Video Annotation, P. Virga, P. Duygulu, CIVR 2005.  [pdf]
  • Subrip for subtitle extraction
  • Reuters captioned photos
  • Sonal Gupta's data for commentary+video
Anish [pdf]
Chao-Yeh [ppt]

Friday
April 23
Prof. Martial Hebert, CMU
Forum for AI Talk
11 AM, TAY 3.128



Apr 28
Pictures of people:

Faces, consumer photo collections, tagging

faces
  • *Understanding Images of Groups of People, A. Gallagher and T. Chen, CVPR 2009.  [pdf]
  • *Contextual Identity Recognition in Personal Photo Albums. D. Anguelov, K.-C. Lee, S. Burak, Gokturk, and B. Sumengen. CVPR 2007.  [pdf]
  • *A Face Annotation Framework with Partial Clustering and Interactive Labeling.  R. X. Y. Tian,W. Liu, F.Wen, and X. Tang.  CVPR 2007.  [pdf] [web]
  • Autotagging Facebook: Social Network Context Improves Photo Annotation, by  Z. Stone, T. Zickler, and T. Darrell.  CVPR Internet Vision Workshop 2008.   [pdf]
  • Efficient Propagation for Face Annotation in Family Albums. L. Zhang, Y. Hu, M. Li, and H. Zhang.  MM 2004.  [pdf]
  • Using Group Prior to Identify People in Consumer Images, A. Gallagher, T. Chen,  CVPR Workshop on Semantic Learning Applications in Multimedia, 2007.  [pdf]
  • Leveraging Archival Video for Building Face Datasets, by D. Ramanan, S. Baker, and S. Kakade.  ICCV 2007.  [pdf]
  • Names and Faces in the News, by T. Berg, A. Berg, J. Edwards, M. Maire, R. White, Y. Teh, E. Learned-Miller and D. Forsyth, CVPR 2004.  [pdf]  [web]
  • Face detection code in OpenCV
  • Gallagher's Person Dataset
MingJun

Friday
April 30




Final paper drafts due
May 5
Course wrap-up
Project presentations, part I


Presentations due
May 13



Final papers due


Other useful links:

 
 
Related courses:
 
Past semesters at UT:
 
By colleagues elsewhere: