WhittleSearch: Interactive Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. International Journal of Computer Vision (IJCV), Volume 115, Issue 2, pp. 185-210, November 2015. [link] [arxiv]

Attribute Pivots for Guiding Relevance Feedback in Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] [patented]

Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Implied Feedback: Learning Nuances of User Behavior in Image Search. D. Parikh and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

WhittleSearch: Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] [supp] [patented]

Learning Binary Hash Codes for Large-Scale Image Search. K. Grauman and R. Fergus. Book chapter, in Machine Learning for Computer Vision, Eds. R. Cipolla, S. Battiato, and G. Farinella, Studies in Computational Intelligence Series, Springer, Volume 411, pp. 49-87, 2013. [pdf] [link]

Efficient Region Search for Object Detection. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

Kernelized Locality-Sensitive Hashing for Scalable Image Search. B. Kulis and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Kyoto, Japan, October 2009. [pdf]

Kernelized Locality-Sensitive Hashing. B. Kulis and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 6, June 2012. [link]

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. P. Jain, S. Vijayanarasimhan, and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2010. [pdf]

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. S. Vijayanarasimhan, P. Jain, and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), Volume 36, No. 2, pp. 276-288, February 2014.

ClickCarving: Interactive Object Segmentation in Images and Videos with Point Clicks. S. Jain and K. Grauman. International Journal of Computer Vision (IJCV), Issue 9, 2019. [link]

Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch. D. Gurari, Y. Zhao, S. Jain, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), Mar 2019. [link] [arXiv]

Thinking Outside the Pool: Active Training Image Creation for Relative Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [code/data]

Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s). D. Gurari, K. He, B. Xiong, J. Zhang, M. Sameki, S. Jain, S. Sclaroff, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), July 2018. [link]

CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question. D. Gurari and K. Grauman. ACM Conference on Human Factors in Computing Systems (CHI), Denver, CO, May 2017. Honorable Mention Award [pdf]

Crowdsourcing in Computer Vision. A. Kovashka, O. Russakovsky, L. Fei-Fei, and K. Grauman. Foundations and Trends in Computer Graphics and Vision, Nov 2016. [link] [arxiv] [pdf]

Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

Active Image Segmentation Propagation. S. D. Jain and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf]

Pull the Plug? Predicting If Computers or Humans Should Segment Images. D. Gurari, S. Jain, M. Betke, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf] [supp]

WhittleSearch: Interactive Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. International Journal of Computer Vision (IJCV), Volume 115, Issue 2, pp. 185-210, November 2015. [link] [arxiv]

Zero-shot Recognition with Unreliable Attributes. D. Jayaraman and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp]

Beyond Comparing Image Pairs: Setwise Active Learning for Relative Attributes. L. Liang and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation. S. Jain and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Active Learning of an Action Detector from Untrimmed Videos. S. Bandla and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Attribute Pivots for Guiding Relevance Feedback in Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] [patented]

Active Frame Selection for Label Propagation in Videos. S. Vijayanarasimhan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012.

WhittleSearch: Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] [supp] [patented]

Annotator Rationales for Visual Recognition. J. Donahue and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

Actively Selecting Annotations Among Objects and Attributes. A. Kovashka, S. Vijayanarasimhan, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. International Journal of Computer Vision (IJCV), Volume 108, Issue 1-2, pp. 97-114, May 2014. [link]

Interactively Building a Discriminative Vocabulary of Nameable Attributes. D. Parikh and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

Discovering Localized Attributes for Fine-grained Recognition. K. Duan, D. Parikh, D. Crandall, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. P. Jain, S. Vijayanarasimhan, and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2010. [pdf]

Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. S. Vijayanarasimhan, P. Jain, and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), Volume 36, No. 2, pp. 276-288, February 2014.

Far-Sighted Active Learning on a Budget for Image and Video Recognition. S. Vijayanarasimhan, P. Jain, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Co-Separating Sounds of Visual Objects. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos]

Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

Learning the Easy Things First: Self-Paced Visual Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Oral) [pdf]

Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 2, pp. 346-358, February 2012. [link]

Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Face Discovery with Social Context. Y. J. Lee and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Dundee, U.K., August 2011. [pdf]

Foreground Focus: Unsupervised Learning from Partially Matching Images. Y. J. Lee and K. Grauman. In International Journal of Computer Vision (IJCV), Vol. 85, No. 2, 2009. [link]

Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

Shape Discovery from Unlabeled Image Collections. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

Foreground Focus: Finding Meaningful Features in Unlabeled Images. Y. J. Lee and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Leeds, U.K., September 2008. (Oral) [pdf]

Keywords to Visual Categories: Multiple-Instance Learning for Weakly Supervised Object Categorization. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, June 2008. [pdf]

Watch, Listen & Learn: Co-training on Captioned Images and Videos. S. Gupta, J. Kim, K. Grauman, and R. Mooney. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML), Antwerp, Belgium, September 2008. [pdf]

Unsupervised Learning of Categories from Sets of Partially Matching Image Features. K. Grauman and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York City, NY, June 2006. (Oral) [pdf]

Single-Stage Visual Query Localization in Egocentric Videos. Hanwen Jiang, Santhosh Kumar Ramakrishnan, and Kristen Grauman. NeurIPS 2023. [pdf]

Boundary Preserving Dense Local Regions. J. Kim and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015. [link]

Deformable Spatial Pyramid Matching for Fast Dense Correspondences. J. Kim, C. Liu, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. [pdf]

Boundary-Preserving Dense Local Regions. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

Asymmetric Region-to-Image Matching for Comparing Images with Generic Object Categories. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Clues from the Beaten Path: Location Estimation with Bursty Sequences of Tourist Photos. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

ClickCarving: Interactive Object Segmentation in Images and Videos with Point Clicks. S. Jain and K. Grauman. International Journal of Computer Vision (IJCV), Issue 9, 2019. [link]

Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch. D. Gurari, Y. Zhao, S. Jain, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), Mar 2019. [link]

Pixel Objectness: Learning to Segment Generic Objects Automatically in Images and Videos. B. Xiong, S. Jain, and K. Grauman. To appear, Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018. [code-imgs] [code-video]

Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s). D. Gurari, K. He, B. Xiong, J. Zhang, M. Sameki, S. Jain, S. Sclaroff, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), 2018. [link]

Pixel Objectness. S. Jain, B. Xiong, and K. Grauman. arXiv. Jan 2017 patent pending

FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Video. S. Jain, B. Xiong, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. [pdf] [DAVIS results leaderboard] patent pending

Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly. H. Jiang and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Oral) [pdf]

Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

Pull the Plug? Predicting If Computers or Humans Should Segment Images. D. Gurari, S. Jain, M. Betke, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf] [supp]

Active Image Segmentation Propagation. S. Jain and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf]

Which Image Pairs Will Cosegment Well? Predicting Partners for Cosegmentation. S. Jain and K. Grauman. In Proceedings of the Asian Conference on Computer Vision (ACCV), Singapore, Nov 2014. [pdf]

Supervoxel-Consistent Foreground Propagation in Video. S. Jain and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf]

Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation. S. Jain and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Shape Sharing for Segmentation. J. Kim and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012. (Oral) [pdf] [supp]

Key-Segments for Video Object Segmentation. Y. J. Lee, J. Kim, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

Efficient Region Search for Object Detection. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

Boundary Preserving Dense Local Regions. J. Kim and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015. [link]

Boundary-Preserving Dense Local Regions. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

Asymmetric Region-to-Image Matching for Comparing Images with Generic Object Categories. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Top-Down Pairwise Potentials for Piecing Together Multi-Class Segmentation Puzzles. S. Vijayanarasimhan and K. Grauman. In Proceedings of the Seventh IEEE Computer Society Workshop on Perceptual Organization in Computer Vision (POCV), June 2010. [pdf]

Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Oral) [pdf]

Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 2, pp. 346-358, February 2012. [link]

Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Learning Object State Changes in Videos: An Open-World Perspective. Zihui Xue, Kumar Ashutosh, Kristen Grauman. CVPR 2024. [pdf] [project page]

Detours for Navigating Instructional Videos. Kumar Ashutosh, Zihui Xue, Tushar Nagarajan, Kristen Grauman. CVPR 2024 (Poster highlight) [pdf]

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos. Kumar Ashutosh, Santhosh Kumar Ramakrishnan, Triantafyllos Afouras, Kristen Grauman. NeurIPS 2023. [pdf]

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding. Shuhan Tan, Tushar Nagarajan, Kristen Grauman. NeurIPS 2023 [pdf]

What You Say Is What You Show: Visual Narration Detection in Instructional Videos. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. arXiv 2023. [pdf]

Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment. Zihui Xue and Kristen Grauman. NeurIPS 2023. [pdf]

SpotEM: Efficient Video Search for Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. ICML 2023 [pdf] [project]

HierVL: Learning Hierarchical Video-Language Embeddings. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. CVPR 2023. [pdf] [project]

Ego4D: Around the World in 3,000 Hours of Egocentric Video. K. Grauman et al. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral) [pdf] [supp] [project page]

Egocentric Activity Recognition and Localization on a 3D Map. M. Liu, L. Ma, K. Somasundaram, Y. Li, K. Grauman, J. Rehg, C. Li. In Proceedings of the European Conference on Computer Vision (ECCV), 2022. [pdf] [project page]

Anticipative Video Transformer. R. Girdhar and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021. [pdf] Winner of the EPIC-Kitchens CVPR'21 Action Anticipation Challenge

Multiview Pseudo-Labeling for Semi-supervised Learning from Video. B. Xiong, H. Fan, K. Grauman, C. Feichtenhofer. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021.

Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos. Y. Li, T. Nagarajan, B. Xiong, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf]

Proposal-based Video Completion. Y-T. Hu, H. Wang, N. Ballas, K. Grauman, A. Schwing. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020.

Ego-Topo: Environment Affordances from Egocentric Video. T. Nagarajan, Y. Li, C. Feichtenhofer, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page/dataset] [pdf] [supp]

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions. E. Ng, D. Xiang, H. Joo, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page/dataset] [pdf]

Listen to Look: Action Recognition by Previewing Audio. R. Gao, T-H. Oh, K. Grauman, L. Torresani. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. [pdf]

Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI). Feb 2020. [link]

Grounded Human-Object Interaction Hotspots from Video. T. Nagarajan, C. Feichtenhofer, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

Co-Separating Sounds of Visual Objects. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos]

Im2Flow: Motion Hallucination from Static Images for Action Recognition. R. Gao, B. Xiong, and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Oral) [pdf]

Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] [supp] [data]

Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video. H. Jiang and K. Grauman. To appear, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Spotlight) [pdf]

Leaving Some Stones Unturned: Dynamic Feature Prioritization for Activity Detection in Streaming Video. Y-C. Su and K. Grauman. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. [pdf] [supp]

Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. D. Jayaraman and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. (Spotlight) [pdf]

Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

Efficient Activity Detection in Untrimmed Video with Max-Subgraph Search. C-Y. Chen and K. Grauman. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), April 2016.

Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance. C-Y. Chen and K. Grauman. International Journal of Computer Vision (IJCV), Oct 2016. [link] [arxiv version]

Predicting the Location of "Interactees" in Novel Human-Object Interactions. C-Y. Chen and K. Grauman. In Proceedings of the Asian Conference on Computer Vision (ACCV), Singapore, Nov 2014. [pdf]

Inferring Unseen Views of People. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

Supervoxel-Consistent Foreground Propagation in Video. S. Jain and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf]

Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots. C-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. (Oral) [pdf]

Active Learning of an Action Detector from Untrimmed Videos. S. Bandla and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Active Frame Selection for Label Propagation in Videos. S. Vijayanarasimhan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012.

Efficient Activity Detection with Max-Subgraph Search. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

Object-Centric Spatio-Temporal Pyramids for Egocentric Activity Recognition. T. McCandless and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Bristol, UK, September 2013. [pdf]

Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

Key-Segments for Video Object Segmentation. Y. J. Lee, J. Kim, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition. A. Kovashka and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Far-Sighted Active Learning on a Budget for Image and Video Recognition. S. Vijayanarasimhan, P. Jain, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

Watch, Listen & Learn: Co-training on Captioned Images and Videos. S. Gupta, J. Kim, K. Grauman, and R. Mooney. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML), Antwerp, Belgium, September 2008. [pdf]

A Task-Driven Intelligent Workspace System to Provide Guidance Feedback. M. S. Ryoo, K. Grauman, and J. K. Aggarwal. Computer Vision and Image Understanding, 2010. [link]

Communication via Eye Blinks: Detection and Duration Analysis in Real Time. K. Grauman, M. Betke, J. Gips, and G. Bradski. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Lihue, HI, December 2001. [pdf]

Densifying Supervision for Fine-Grained Comparisons. A. Yu and K. Grauman. International Journal of Computer Vision (IJCV), Special Issue on Generative Adversarial Networks for Computer Vision, 2020. [pdf]

Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias. K. Singh, D. Mahajan, K. Grauman, Y. J. Lee, M. Feiszli, D. Ghadiyaram. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [pdf]

Thinking Outside the Pool: Active Training Image Creation for Relative Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [code/data]

Attributes as Operators. T. Nagarajan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] [code]

Compare and Contrast: Learning Prominent Visual Differences. S. Chen and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] [supp]

Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images. A. Yu and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, Oct 2017. [pdf] [supp]

Fine-Grained Comparisons with Attributes. A. Yu and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017. [pdf]

Divide, Share, and Conquer: Multi-task Attribute Learning with Selective Sharing. C-Y. Chen, Dinesh Jayaraman, F. Sha, and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017. [pdf]

Attributes for Image Retrieval. A. Kovashka and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017.

Just Noticeable Differences in Visual Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec 2015. [pdf] [supp]

Zero-shot Recognition with Unreliable Attributes. D. Jayaraman and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp]

Predicting Useful Neighborhoods for Lazy Local Learning. A. Yu and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp]

Discovering Attribute Shades of Meaning with the Crowd. A. Kovashka and K. Grauman. International Journal of Computer Vision (IJCV), Volume 114, Issue 1, pp. 56-73, August 2015. [link] [arxiv]

Discovering Shades of Attribute Meaning with the Crowd. A. Kovashka and K. Grauman. Third International Workshop on Parts and Attributes, in conjunction with the European Conference on Computer Vision. Zurich, Switzerland, Sept 2014. [pdf]

Fine-Grained Visual Comparisons with Local Learning. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

Decorrelating Semantic Visual Attributes by Resisting the Urge to Share. D. Jayaraman, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. (Oral) [pdf]

Inferring Analogous Attributes. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

Analogy-Preserving Semantic Embedding for Visual Object Categorization. S. J. Hwang, K. Grauman, and F. Sha. In International Conference on Machine Learning (ICML), Atlanta, GA, June 2013. [pdf]

Semantic Kernel Forests from Multiple Taxonomies. S. J. Hwang, K. Grauman, and F. Sha. In Advances in Neural Information Processing Systems (NIPS), Tahoe, Nevada, December 2012. [pdf]

Semantic Kernel Forests from Multiple Taxonomies. S. J. Hwang, F. Sha, and K. Grauman. In Big Data Meets Computer Vision: First International Workshop on Large Scale Visual Recognition and Retrieval. In conjunction with NIPS, 2012. [pdf]

Discovering Localized Attributes for Fine-grained Recognition. K. Duan, D. Parikh, D. Crandall, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

Relative Attributes. D. Parikh and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. (Oral) [pdf] [Marr Prize, ICCV Best Paper Award]

Relative Attributes for Enhanced Human-Machine Communication. D. Parikh, A. Kovashka, A. Parkash, and K. Grauman. Invited paper, Proceedings of AAAI 2012, Sub-Area Spotlights Track for Best Papers. [pdf]

Sharing Features Between Objects and Their Attributes. S. J. Hwang, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

Learning with Whom to Share in Multi-task Feature Learning. Z. Kang, K. Grauman, and F. Sha. In Proceedings of the International Conference on Machine Learning (ICML), Bellevue, WA, July 2011. [pdf]

Accounting for the Relative Importance of Objects in Image Retrieval. S. J. Hwang and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Aberystwyth, UK, September 2010. (Oral) [pdf]

Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search. S. J. Hwang and K. Grauman. International Journal of Computer Vision (IJCV), Vol. 100, Issue 2, pp. 134-153, November 2012. [link]

Learning a Tree of Metrics with Disjoint Visual Features. S. J. Hwang, K. Grauman, and F. Sha. In Advances in Neural Information Processing Systems (NIPS), Granada, Spain, December 2011. [pdf]


	UT-Austin Computer Vision Group Publications

	We are exploring problems in visual recognition and search, organized into the following topics: Self-supervised feature learning from video; Audio-visual video analysis; Fashion image analysis; 360 images and video; Egocentric perception / first-person vision / embodied AI; Video summarization; Image search, recognition, and large-scale retrieval; Active visual learning / human-in-the-loop; Unsupervised visual discovery; Image correspondences and matching; Region-based recognition and segmentation; Activity recognition and video understanding; Domain adaptation and transfer learning; Learning semantic visual representations, visual attributes; Vision and language; Other.

	Image search and large-scale retrieval
	Single-Stage Visual Query Localization in Egocentric Videos. Hanwen Jiang, Santhosh Kumar Ramakrishnan, and Kristen Grauman. NeurIPS 2023. [pdf]

	BlockDrop: Dynamic Inference Paths in Residual Networks. Z. Wu, T. Nagarajan, A. Kumar, S. Rennie, L. Davis, K. Grauman, R. Feris. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Spotlight) [pdf] [supp] [code]

	WhittleSearch: Interactive Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. International Journal of Computer Vision (IJCV), Volume 115, Issue 2, pp 185-210, November 2015. [link] [arxiv]

	Attribute Pivots for Guiding Relevance Feedback in Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] [patented]

	Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	Implied Feedback: Learning Nuances of User Behavior in Image Search. D. Parikh and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	WhittleSearch: Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] [supp] [patented]

	Learning Binary Hash Codes for Large-Scale Image Search. K. Grauman and R. Fergus. Book chapter, in Machine Learning for Computer Vision, Ed., R. Cipolla, S. Battiato, and G. Farinella, Studies in Computational Intelligence Series, Springer, Volume 411, pp. 49-87, 2013 [pdf] [link]

	Efficient Region Search for Object Detection. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

	Kernelized Locality-Sensitive Hashing for Scalable Image Search. B. Kulis and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Kyoto, Japan, October, 2009. [pdf]

	Kernelized Locality-Sensitive Hashing. B. Kulis and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 6, June 2012. [link]

	Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. P. Jain, S. Vijayanarasimhan, and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2010. [pdf]

	Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. S. Vijayanarasimhan, P. Jain, and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), Volume 36, No. 2, pp. 276-288, February 2014.

	Fast Similarity Search for Learned Metrics. B. Kulis, P. Jain, and K. Grauman. In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 31, No. 12, December, 2009. [link]

	Accounting for the Relative Importance of Objects in Image Retrieval. S. J. Hwang and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Aberystwyth, UK, September 2010. (Oral) [pdf]

	Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search. S. J. Hwang and K. Grauman. International Journal of Computer Vision (IJCV), published online October 2011. [link]

	Efficiently Searching for Similar Images. K. Grauman. Invited article in the Communications of the ACM, 2009. [pdf]

	Online Metric Learning and Fast Similarity Search. P. Jain, B. Kulis, I. Dhillon, and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2008. (Oral) [pdf]

	Fast Image Search for Learned Metrics. P. Jain, B. Kulis, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, June 2008. (Oral) [Best Student Paper Award] [pdf]

	Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences. K. Grauman and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, MN, June 2007. [pdf]

	A Picture is Worth a Thousand Keywords: Image-Based Object Search on a Mobile Platform. T. Yeh, K. Grauman, K. Tollmar, and T. Darrell. In CHI 2005, Conference on Human Factors in Computing Systems, Portland, OR, April 2005. [pdf]

	Active and interactive visual learning, human-in-the-loop
	ClickCarving: Interactive Object Segmentation in Images and Videos with Point Clicks. S. Jain and K. Grauman. International Journal of Computer Vision (IJCV), Issue 9, 2019. [link]

	Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch. D. Gurari, Y. Zhao, S. Jain, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), Mar 2019. [link] [arXiv]

	Thinking Outside the Pool: Active Training Image Creation for Relative Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [code/data]

	Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s). D. Gurari, K. He, B. Xiong, J. Zhang, M. Sameki, S. Jain, S. Sclaroff, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), July 2018. [link]

	CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question. D. Gurari and K. Grauman. ACM Conference on Human Factors in Computing Systems (CHI), Denver, CO, May 2017. [Honorable Mention Award] [pdf]

	Crowdsourcing in Computer Vision. A. Kovashka, O. Russakovsky, L. Fei-Fei, and K. Grauman. Foundations and Trends in Computer Graphics and Vision, Nov 2016. [link] [arxiv] [pdf]

	Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

	Active Image Segmentation Propagation. S. D. Jain and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf]

	Pull the Plug? Predicting If Computers or Humans Should Segment Images. D. Gurari, S. Jain, M. Betke, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf] [supp]

	WhittleSearch: Interactive Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. International Journal of Computer Vision (IJCV), Volume 115, Issue 2, pp 185-210, November 2015. [link] [arxiv]

	Zero-shot Recognition with Unreliable Attributes. D. Jayaraman and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp]

	Beyond Comparing Image Pairs: Setwise Active Learning for Relative Attributes. L. Liang and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

	Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation. S. Jain and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	Active Learning of an Action Detector from Untrimmed Videos. S. Bandla and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	Attribute Pivots for Guiding Relevance Feedback in Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] [patented]

	Active Frame Selection for Label Propagation in Videos. S. Vijayanarasimhan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012.

	WhittleSearch: Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] [supp] [patented]

	Annotator Rationales for Visual Recognition. J. Donahue and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

	Actively Selecting Annotations Among Objects and Attributes. A. Kovashka, S. Vijayanarasimhan, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

	Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

	Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. S. Vijayanarasimhan and K. Grauman. International Journal of Computer Vision (IJCV), Volume 108, Issue 1-2, pp. 97-114, May 2014. [link]

	Interactively Building a Discriminative Vocabulary of Nameable Attributes. D. Parikh and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

	Discovering Localized Attributes for Fine-grained Recognition. K. Duan, D. Parikh, D. Crandall, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. P. Jain, S. Vijayanarasimhan, and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2010. [pdf]

	Hashing Hyperplane Queries to Near Points with Applications to Large-Scale Active Learning. S. Vijayanarasimhan, P. Jain, and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), Volume 36, No. 2, pp. 276-288, February 2014.

	Far-Sighted Active Learning on a Budget for Image and Video Recognition. S. Vijayanarasimhan, P. Jain, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Cost-Sensitive Active Visual Category Learning. S. Vijayanarasimhan and K. Grauman. International Journal of Computer Vision (IJCV), Vol. 91, Issue 1 (2011), p. 24. (online first July 2010). [link]

	Minimizing Annotation Costs in Visual Category Learning. S. Vijayanarasimhan and K. Grauman. Invited chapter, in Cost-Sensitive Machine Learning, B. Krishnapuram, S. Yu, and B. Rao, Editors. Chapman and Hall/CRC, December 2011. [link]

	Reading Between The Lines: Object Localization Using Implicit Cues from Image Tags. S. J. Hwang and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Oral) [pdf]

	Reading Between The Lines: Object Localization Using Implicit Cues from Image Tags. S. J. Hwang and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 6, pp. 1145-1158, June 2012. [link]

	What’s It Going to Cost You?: Predicting Effort vs. Informativeness for Multi-Label Image Annotations. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

	Cost-Sensitive Active Visual Category Learning. S. Vijayanarasimhan and K. Grauman. Abstract in the Learning Workshop (The Snowbird Workshop), Clearwater, FL, April 2009. [pdf]

	Multi-Level Active Prediction of Useful Image Annotations for Recognition. S. Vijayanarasimhan and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2008. (Oral) [pdf]

	Gaussian Processes for Object Categorization. A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell. In International Journal of Computer Vision (IJCV), Vol. 88, No. 2, 2010. [link]

	Active Learning with Gaussian Processes for Object Categorization. A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell. In Proceedings of the IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, October 2007. [pdf]

	Unsupervised and semi-supervised visual discovery
	Co-Separating Sounds of Visual Objects. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

	Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos]

	Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Learning the Easy Things First: Self-Paced Visual Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

	Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Oral) [pdf]

	Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 2, pp. 346-358, February 2012. [link]

	Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Face Discovery with Social Context. Y. J. Lee and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Dundee, U.K., August 2011. [pdf]

	Foreground Focus: Unsupervised Learning from Partially Matching Images. Y. J. Lee and K. Grauman. In International Journal of Computer Vision (IJCV), Vol. 85, No. 2, 2009. [link]

	Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

	Shape Discovery from Unlabeled Image Collections. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

	Foreground Focus: Finding Meaningful Features in Unlabeled Images. Y. J. Lee and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Leeds, U.K., September 2008. (Oral) [pdf]

	Keywords to Visual Categories: Multiple-Instance Learning for Weakly Supervised Object Categorization. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, June 2008. [pdf]

	Watch, Listen & Learn: Co-training on Captioned Images and Videos. S. Gupta, J. Kim, K. Grauman, and R. Mooney. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML), Antwerp, Belgium, September 2008. [pdf]

	Unsupervised Learning of Categories from Sets of Partially Matching Image Features. K. Grauman and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York City, NY, June 2006. (Oral) [pdf]

	Image matching and local feature correspondences
	Single-Stage Visual Query Localization in Egocentric Videos. Hanwen Jiang, Santhosh Kumar Ramakrishnan, and Kristen Grauman. NeurIPS 2023. [pdf]

	Boundary Preserving Dense Local Regions. J. Kim and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015. [link]

	Deformable Spatial Pyramid Matching for Fast Dense Correspondences. J. Kim, C. Liu, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. [pdf]

	Boundary-Preserving Dense Local Regions. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

	Asymmetric Region-to-Image Matching for Comparing Images with Generic Object Categories. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Clues from the Beaten Path: Location Estimation with Bursty Sequences of Tourist Photos. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

	The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. K. Grauman and T. Darrell. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005. (Oral) [pdf]

	Approximate Correspondences in High Dimensions. K. Grauman and T. Darrell. In Advances in Neural Information Processing Systems 19 (NIPS) 2007. [pdf]

	The Pyramid Match: Efficient Learning with Partial Correspondences. K. Grauman. In Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI), (Nectar track, for AI results presented at other conferences in last two years), Vancouver, Canada, July 2007. [pdf]

	The Pyramid Match Kernel: Efficient Learning with Sets of Features. K. Grauman and T. Darrell. Journal of Machine Learning Research (JMLR), 8 (Apr): 725-760, 2007. [pdf]

	Efficient Image Matching with Distributions of Local Invariant Features. K. Grauman and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, June 2005. [pdf]

	Fast Contour Matching Using Approximate Earth Mover's Distance. K. Grauman and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Washington DC, June 2004. [pdf]

	Region-based recognition and segmentation
	ClickCarving: Interactive Object Segmentation in Images and Videos with Point Clicks. S. Jain and K. Grauman. International Journal of Computer Vision (IJCV), Issue 9, 2019. [link]

	Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch. D. Gurari, Y. Zhao, S. Jain, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), Mar 2019. [link]

	Pixel Objectness: Learning to Segment Generic Objects Automatically in Images and Videos. B. Xiong, S. Jain, and K. Grauman. To appear, Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018. [code-imgs] [code-video]

	Predicting Foreground Object Ambiguity and Efficiently Crowdsourcing the Segmentation(s). D. Gurari, K. He, B. Xiong, J. Zhang, M. Sameki, S. Jain, S. Sclaroff, M. Betke, and K. Grauman. International Journal of Computer Vision (IJCV), 2018. [link]

	Pixel Objectness. S. Jain, B. Xiong, and K. Grauman. arXiv, Jan 2017. [patent pending]

	FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Video. S. Jain, B. Xiong, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. [pdf] [DAVIS results leaderboard] [patent pending]

	Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly. H. Jiang and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Oral) [pdf]

	Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

	Pull the Plug? Predicting If Computers or Humans Should Segment Images. D. Gurari, S. Jain, M. Betke, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf] [supp]

	Active Image Segmentation Propagation. S. Jain and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf]

	Which Image Pairs Will Cosegment Well? Predicting Partners for Cosegmentation. S. Jain and K. Grauman. In Proceedings of the Asian Conference on Computer Vision (ACCV), Singapore, Nov 2014. [pdf]

	Supervoxel-Consistent Foreground Propagation in Video. S. Jain and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf]

	Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation. S. Jain and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	Shape Sharing for Segmentation. J. Kim and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012. (Oral) [pdf] [supp]

	Key-Segments for Video Object Segmentation. Y. J. Lee, J. Kim, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

	Efficient Region Search for Object Detection. S. Vijayanarasimhan and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf]

	Boundary Preserving Dense Local Regions. J. Kim and K. Grauman. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015. [link]

	Boundary-Preserving Dense Local Regions. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. (Oral) [pdf]

	Asymmetric Region-to-Image Matching for Comparing Images with Generic Object Categories. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Top-Down Pairwise Potentials for Piecing Together Multi-Class Segmentation Puzzles. S. Vijayanarasimhan and K. Grauman. In Proceedings of the Seventh IEEE Computer Society Workshop on Perceptual Organization in Computer Vision (POCV), June 2010. [pdf]

	Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Oral) [pdf]

	Object-Graphs for Context-Aware Category Discovery. Y. J. Lee and K. Grauman. In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2011. [link]

	Collect-Cut: Segmentation with Top-Down Cues Discovered in Multi-Object Images. Y. J. Lee and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Activity recognition and video understanding
	Learning Object State Changes in Videos: An Open-World Perspective. Zihui Xue, Kumar Ashutosh, Kristen Grauman. CVPR 2024. [pdf] [project page]

	Detours for Navigating Instructional Videos. Kumar Ashutosh, Zihui Xue, Tushar Nagarajan, Kristen Grauman. CVPR 2024 (Poster highlight) [pdf]

	Video-Mined Task Graphs for Keystep Recognition in Instructional Videos. Kumar Ashutosh, Santhosh Kumar Ramakrishnan, Triantafyllos Afouras, Kristen Grauman. NeurIPS 2023. [pdf]

	EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding. Shuhan Tan, Tushar Nagarajan, Kristen Grauman. NeurIPS 2023 [pdf]

	What You Say Is What You Show: Visual Narration Detection in Instructional Videos. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. arXiv 2023. [pdf]

	Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment. Zihui Xue and Kristen Grauman. NeurIPS 2023. [pdf]

	SpotEM: Efficient Video Search for Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. ICML 2023 [pdf] [project]

	HierVL: Learning Hierarchical Video-Language Embeddings. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. CVPR 2023. [pdf] [project]

	Ego4D: Around the World in 3,000 Hours of Egocentric Video. K. Grauman et al. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral) [pdf] [supp] [project page]

	Egocentric Activity Recognition and Localization on a 3D Map. M. Liu, L. Ma, K. Somasundaram, Y. Li, K. Grauman, J. Rehg, C. Li. In Proceedings of the European Conference on Computer Vision (ECCV), 2022. [pdf] [project page]

	Anticipative Video Transformer. R. Girdhar and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021. [pdf] [Winner of the EPIC-Kitchens CVPR'21 Action Anticipation Challenge]

	Multiview Pseudo-Labeling for Semi-supervised Learning from Video. B. Xiong, H. Fan, K. Grauman, C. Feichtenhofer. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021.

	Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos. Y. Li, T. Nagarajan, B. Xiong, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf]

	Proposal-based Video Completion. Y-T. Hu, H. Wang, N. Ballas, K. Grauman, A. Schwing. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020.

	Ego-Topo: Environment Affordances from Egocentric Video. T. Nagarajan, Y. Li, C. Feichtenhofer, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page/dataset] [pdf] [supp]

	You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions. E. Ng, D. Xiang, H. Joo, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page/dataset] [pdf]

	Listen to Look: Action Recognition by Previewing Audio. R. Gao, T-H. Oh, K. Grauman, L. Torresani. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. [pdf]

	Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI). Feb 2020. [link]

	Grounded Human-Object Interaction Hotspots from Video. T. Nagarajan, C. Feichtenhofer, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

	Co-Separating Sounds of Visual Objects. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp]

	Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos]

	Im2Flow: Motion Hallucination from Static Images for Action Recognition. R. Gao, B. Xiong, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Oral) [pdf]

	Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] [supp] [data]

	Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video. H. Jiang and K. Grauman. To appear, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Spotlight) [pdf]

	Leaving Some Stones Unturned: Dynamic Feature Prioritization for Activity Detection in Streaming Video. Y-C. Su and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. [pdf] [supp]

	Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. D. Jayaraman and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. (Spotlight) [pdf]

	Click Carving: Segmenting Objects in Video with Point Clicks. S. D. Jain and K. Grauman. In Proceedings of the Fourth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, October 2016. [pdf]

	Efficient Activity Detection in Untrimmed Video with Max-Subgraph Search. C-Y. Chen and K. Grauman. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), April 2016.

	Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance. C-Y. Chen and K. Grauman. International Journal of Computer Vision (IJCV), Oct 2016. [link] [arxiv version]

	Predicting the Location of "Interactees" in Novel Human-Object Interactions. C-Y. Chen and K. Grauman. In Proceedings of the Asian Conference on Computer Vision (ACCV), Singapore, Nov 2014. [pdf]

	Inferring Unseen Views of People. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf]

	Supervoxel-Consistent Foreground Propagation in Video. S. Jain and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf]

	Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots. C-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. (Oral) [pdf]

	Active Learning of an Action Detector from Untrimmed Videos. S. Bandla and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf]

	Active Frame Selection for Label Propagation in Videos. S. Vijayanarasimhan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, October 2012.

	Efficient Activity Detection with Max-Subgraph Search. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Object-Centric Spatio-Temporal Pyramids for Egocentric Activity Recognition. T. McCandless and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Bristol, UK, September 2013. [pdf]

	Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Key-Segments for Video Object Segmentation. Y. J. Lee, J. Kim, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. [pdf]

	Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition. A. Kovashka and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Far-Sighted Active Learning on a Budget for Image and Video Recognition. S. Vijayanarasimhan, P. Jain, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. [pdf]

	Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates. J. Kim and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. [pdf]

	Watch, Listen & Learn: Co-training on Captioned Images and Videos. S. Gupta, J. Kim, K. Grauman, and R. Mooney. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML), Antwerp, Belgium, September 2008. [pdf]

	A Task-Driven Intelligent Workspace System to Provide Guidance Feedback. M. S. Ryoo, K. Grauman, and J. K. Aggarwal. Computer Vision and Image Understanding, 2010. [link]

	Communication via Eye Blinks and Eyebrow Raises: Video-Based Human-Computer Interfaces. K. Grauman, M. Betke, J. Lombardi, J. Gips, and G. Bradski. Universal Access in the Information Society, 2(4) pp. 359-373, Springer-Verlag Heidelberg, November 2003. [link]

	Communication via Eye Blinks: Detection and Duration Analysis in Real Time. K. Grauman, M. Betke, J. Gips, and G. Bradski. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Lihue, HI, December 2001. [pdf]

	Egocentric perception / first-person vision / embodied AI
	Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos. Mi Luo, Zihui Xue, Alex Dimakis, Kristen Grauman. ECCV 2024 [pdf] 4DIFF: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation. Feng Cheng, Mi Luo, Huiyu Wang, Alex Dimakis, Lorenzo Torresani, Gedas Bertasius, Kristen Grauman. ECCV 2024 Active Audio-Visual Exploration for Acoustic Environment Modeling. Arjun Somayazulu, Sagnik Majumder, Changan Chen, Kristen Grauman. IROS 2024. Sim2Real Transfer for Audio-Visual Navigation with Frequency-Adaptive Acoustic Field Prediction. Changan Chen, Jordi Ramos Chen, Anshul Tomar, Kristen Grauman. IROS 2024. Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, .... Michael Wray. CVPR 2024 (Oral) [paper] [supp/appendix] [data/benchmarks] [blog] SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos. Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman. CVPR 2024. [pdf] [project page] Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos. Sagnik Majumder, Ziad Al-Halah, Kristen Grauman. CVPR 2024. [pdf] EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset. Hao Tang, Kevin J Liang, Kristen Grauman, Matt Feiszli, Weiyao Wang. NeurIPS 2023. [pdf] EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding. Shuhan Tan, Tushar Nagarajan, Kristen Grauman. NeurIPS 2023 [pdf] Single-Stage Visual Query Localization in Egocentric Videos. Hanwen Jiang, Santhosh Kumar Ramakrishnan, and Kristen Grauman. NeurIPS 2023. [pdf] Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment. Zihui Xue and Kristen Grauman. NeurIPS 2023. [pdf] EgoEnv: Human-centric environment representations from egocentric video. 
Tushar Nagarajan, Santhosh Kumar Ramakrishnan, Ruta Desai, James Hillis, Kristen Grauman. NeurIPS 2023 (Oral) [pdf] SpotEM: Efficient Video Search for Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. ICML 2023 [pdf] [project] Learning to Map Efficiently by Active Echolocation. Xixi Hu, Senthil Purushwalkam, David Harwath, Kristen Grauman. IROS 2023. HierVL: Learning Hierarchical Video-Language Embeddings. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. CVPR 2023. [pdf] [project] NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. CVPR 2023. [pdf] [project] Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations. Sagnik Majumder, Hao Jiang, Pierre Moulon, Ethan Henderson, Paul Calamia, Kristen Grauman, Vamsi Ithapu. CVPR 2023. [pdf] [project] Egocentric Video Task Translation. Zihui Xue, Yale Song, Kristen Grauman, Lorenzo Torresani. CVPR 2023. (CVPR Highlight paper & winner of Ego4D 2022 "Talking To Me" benchmark challenge) [pdf] [project] Ego4D: Around the World in 3,000 Hours of Egocentric Video. K Grauman et al. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral, Best Paper Finalist) [pdf] [supp] [project page] Egocentric Activity Recognition and Localization on a 3D Map. M. Liu, L. Ma, K. Somasundaram, Y. Li, K. Grauman, J. Rehg, C. Li. In Proceedings of the European Conference on Computer Vision (ECCV), 2022. [pdf] [project page] Few-Shot Audio-Visual Learning of Environment Acoustics. Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman. NeurIPS 2022. [pdf] [project page] SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning. Changan Chen, Carl Schissler, Sanchit Garg, Philip Kobernik, Alexander Clegg, Paul Calamia, Dhruv Batra, Philip W Robinson, Kristen Grauman. NeurIPS 2022 [pdf] [project page] Active Audio-Visual Separation of Dynamic Sound Sources. S. 
Majumder and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), 2022. [pdf] [project page] PONI: Potential Functions for ObjectGoal Navigation with Interaction-Free Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral) [pdf] [project page] Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation. Z. Al-Halah, S. Ramakrishnan, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. [pdf] [project page] Environment Predictive Coding for Embodied Agents. S. K. Ramakrishnan, T. Nagarajan, Z. Al-Halah, and K. Grauman. In Proceedings of the International Conference on Learning Representations (ICLR), 2022. [pdf] Move2Hear: Active Audio-Visual Source Separation. S. Majumder, Z. Al-Halah, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021. [pdf] [code/videos] DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video. P. Mandikal and K. Grauman. In Conference on Robot Learning (CoRL), 2021. Shaping Embodied Agent Behavior with Activity-context Priors from Egocentric Video. T. Nagarajan and K. Grauman. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), Dec 2021. (spotlight oral). Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos. Y. Li, T. Nagarajan, B. Xiong, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf] Learning Dexterous Grasping with Object-Centric Visual Affordances. P. Mandikal and K. Grauman. In Proceedings of the International Conference on Robotics and Automation (ICRA), 2021. [pdf] [project] Learning to Set Waypoints for Audio-Visual Navigation. C. Chen, S. Majumder, Z. Al-Halah, R. Gao, S. Ramakrishnan, K. Grauman. 
In Proceedings of the International Conference on Learning Representations (ICLR), May 2021. [pdf] [project] Semantic Audio-Visual Navigation. C. Chen, Z. Al-Halah, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf] [project/video] Audio-Visual Floorplan Reconstruction. S. Purushwalkam, S. V. A. Gari, V. K. Ithapu, C. Schissler, P. Robinson, A. Gupta, K. Grauman. ICCV 2021. [pdf] [project/video] Environment Predictive Coding for Embodied Agents. S. Ramakrishnan, T. Nagarajan, Z. Al-Halah, K. Grauman. arXiv:2102.02337, Feb 2021 [pdf] Learning Affordance Landscapes for Interaction Exploration in 3D Environments. T. Nagarajan and K. Grauman. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), Dec 2020. (Spotlight) [pdf] [project] An Exploration of Embodied Visual Exploration. S. Ramakrishnan, D. Jayaraman, K. Grauman. IJCV 2021. [pdf] [project] [code] SoundSpaces: Audio-Visual Navigation in 3D Environments. C. Chen, U. Jain, C. Schissler, S. V. Amengual Gari, Z. Al-Halah, V. Ithapu, P. Robinson, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020. (Spotlight) Occupancy Anticipation for Efficient Exploration and Navigation. S. Ramakrishnan, Z. Al-Halah, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020. (Spotlight) Winner of the 2020 Habitat PointNav Challenge. VisualEchoes: Spatial Image Representation Learning through Echolocation. R. Gao, C. Chen, Z. Al-Halah, C. Schissler, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020. Ego-Topo: Environment Affordances from Egocentric Video. T. Nagarajan, Y. Li, C. Feichtenhofer, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. 
(Oral) [project page/dataset] [pdf] [supp] You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions. E. Ng, D. Xiang, H. Joo, K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page/dataset] [pdf] Grounded Human-Object Interaction Hotspots from Video. T. Nagarajan, C. Feichtenhofer, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp] Emergence of Exploratory Look-around Behaviors through Active Observation Completion. S. Ramakrishnan, D. Jayaraman, and K. Grauman. Science Robotics, Vol. 4, Issue 30, May 2019. [link] 2.5D Visual Sound. R. Gao and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. (Oral) [Best paper award finalist] [pdf] [supp] [FAIR-Play dataset] [videos] [code] Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos] Sidekick Policy Learning for Active Visual Exploration. S. Ramakrishnan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] [videos/code] ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids. D. Jayaraman, R. Gao, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] End-to-end Policy Learning for Active Visual Categorization. D. Jayaraman and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), May 2018. [pdf] Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks. D. Jayaraman and K. Grauman. 
In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video. H. Jiang and K. Grauman. To appear, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Spotlight) [pdf] Next-active-object prediction from egocentric videos. A. Furnari, S. Battiato, K. Grauman, and G. Maria Farinella. Journal of Visual Communication and Image Representation. Volume 49, pp. 401-411, November 2017. [link] Learning Image Representations Tied to Egomotion from Unlabeled Video. D. Jayaraman and K. Grauman. International Journal of Computer Vision (IJCV), Special Issue for Best Papers of ICCV 2015, accepted Feb 2017. [pdf] [preprint] Look-Ahead Before You Leap: End-to-End Active Recognition by Forecasting the Effect of Motion. D. Jayaraman and K. Grauman. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. (Oral) [pdf] [supp] Detecting Engagement in Egocentric Video. Y-C. Su and K. Grauman. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. (Oral) [pdf] [supp] Text Detection in Stores Using a Repetition Prior. B. Xiong and K. Grauman. In Proceedings of the IEEE Winter Conference on Computer Vision (WACV). Lake Placid, NY, March 2016. [pdf] Intentional Photos from an Unintentional Photographer: Detecting Snap Points in Egocentric Video with a Web Photo Prior. B. Xiong and K. Grauman. Invited chapter. In Mobile Cloud Visual Media Computing. Springer International Publishing. Editors: G. Hua and X.-S. Hua. pp 85-111. November 2015. [pdf] Learning Image Representations Tied to Ego-Motion. D. Jayaraman and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec 2015. (Oral) [pdf] [supp] Predicting Important Objects for Egocentric Video Summarization. Y. J. Lee and K. Grauman. 
International Journal on Computer Vision, Volume 114, Issue 1, pp. 38-55, August 2015. [link] [arxiv] Detecting Snap Points in Egocentric Video with a Web Photo Prior. B. Xiong and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf] Story-Driven Summarization for Egocentric Video. Z. Lu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. [pdf] Object-Centric Spatio-Temporal Pyramids for Egocentric Activity Recognition. T. McCandless and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Bristol, UK, September 2013. [pdf] Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Learning semantic visual representations / visual attributes
	Densifying Supervision for Fine-Grained Comparisons. A. Yu and K. Grauman. International Journal of Computer Vision (IJCV), Special Issue on Generative Adversarial Networks for Computer Vision, 2020. [pdf] Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias. K. Singh, D. Mahajan, K. Grauman, Y. J. Lee, M. Feiszli, D. Ghadiyaram. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [pdf] Thinking Outside the Pool: Active Training Image Creation for Relative Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [code/data] Attributes as Operators. T. Nagarajan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] [code] Compare and Contrast: Learning Prominent Visual Differences. S. Chen and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] [supp] Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images. A. Yu and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, Oct 2017. [pdf] [supp] Fine-Grained Comparisons with Attributes. A. Yu and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017. [pdf] Divide, Share, and Conquer: Multi-task Attribute Learning with Selective Sharing. C-Y. Chen, Dinesh Jayaraman, F. Sha, and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017. [pdf] Attributes for Image Retrieval. A. Kovashka and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017. Just Noticeable Differences in Visual Attributes. A. Yu and K. Grauman. 
In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec 2015. [pdf] [supp] Zero-shot Recognition with Unreliable Attributes. D. Jayaraman and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp] Predicting Useful Neighborhoods for Lazy Local Learning. A. Yu and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp] Discovering Attribute Shades of Meaning with the Crowd. A. Kovashka and K. Grauman. International Journal on Computer Vision (IJCV), Volume 114, Issue 1, pp. 56-73, August 2015. [link] [arxiv] Discovering Shades of Attribute Meaning with the Crowd. A. Kovashka and K. Grauman. Third International Workshop on Parts and Attributes, in conjunction with the European Conference on Computer Vision. Zurich, Switzerland, Sept 2014. [pdf] Fine-Grained Visual Comparisons with Local Learning. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf] Decorrelating Semantic Visual Attributes by Resisting the Urge to Share. D. Jayaraman, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. (Oral) [pdf] Inferring Analogous Attributes. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf] Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] Analogy-Preserving Semantic Embedding for Visual Object Categorization. S. J. Hwang, K. Grauman, and F. Sha. In International Conference on Machine Learning (ICML), Atlanta, GA, June 2013. [pdf] Semantic Kernel Forests from Multiple Taxonomies. S. J. Hwang, K. 
Grauman, and F. Sha. In Advances in Neural Information Processing Systems (NIPS), Tahoe, Nevada, December 2012. [pdf] Semantic Kernel Forests from Multiple Taxonomies. S. J. Hwang, F. Sha, and K. Grauman. In Big Data Meets Computer Vision: First International Workshop on Large Scale Visual Recognition and Retrieval. In conjunction with NIPS, 2012. [pdf] Discovering Localized Attributes for Fine-grained Recognition. K. Duan, D. Parikh, D. Crandall, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] Relative Attributes. D. Parikh and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. (Oral) [pdf] [Marr Prize, ICCV Best Paper Award] Relative Attributes for Enhanced Human-Machine Communication. D. Parikh, A. Kovashka, A. Parkash, and K. Grauman. Invited paper, Proceedings of AAAI 2012, Sub-Area Spotlights Track for Best Papers. [pdf] Sharing Features Between Objects and Their Attributes. S. J. Hwang, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011. [pdf] Learning with Whom to Share in Multi-task Feature Learning. Z. Kang, K. Grauman, and F. Sha. In Proceedings of the International Conference on Machine Learning (ICML), Bellevue, WA, July 2011. [pdf] Accounting for the Relative Importance of Objects in Image Retrieval. S. J. Hwang and K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), Aberystwyth, UK, September 2010. (Oral) [pdf] Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search. S. J. Hwang and K. Grauman. International Journal of Computer Vision (IJCV), Vol. 100, Issue 2, pp. 134-153, November 2012. [link] Learning a Tree of Metrics with Disjoint Visual Features. S. J. Hwang, K. Grauman, F. Sha. 
In Advances in Neural Information Processing Systems (NIPS). Granada, Spain, December 2011. [pdf]

	Self-supervised feature learning from video
	Environment Predictive Coding for Embodied Agents. S. K. Ramakrishnan, T. Nagarajan, Z. Al-Halah, and K. Grauman. In Proceedings of the International Conference on Learning Representations (ICLR), 2022. [pdf] Im2Flow: Motion Hallucination from Static Images for Action Recognition. R. Gao, B. Xiong, and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Oral) [pdf] ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids. D. Jayaraman, R. Gao, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] Learning Image Representations Tied to Egomotion from Unlabeled Video. D. Jayaraman and K. Grauman. International Journal of Computer Vision (IJCV), Special Issue for Best Papers of ICCV 2015, accepted Feb 2017. [pdf] [preprint] Object-Centric Representation Learning from Unlabeled Videos. R. Gao, D. Jayaraman, and K. Grauman. Proceedings of the Asian Conference on Computer Vision (ACCV), Taipei, November 2016. [pdf] Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. D. Jayaraman and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. (Spotlight) [pdf] Learning Image Representations Tied to Ego-Motion. D. Jayaraman and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec 2015. (Oral) [pdf] [supp]

	Audio-visual video analysis
	Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos. Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman. ECCV 2024 (Oral) [pdf] [project] SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos. Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman. CVPR 2024. [pdf] [project page] 4DIFF: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation. Feng Cheng, Mi Luo, Huiyu Wang, Alex Dimakis, Lorenzo Torresani, Gedas Bertasius, Kristen Grauman. ECCV 2024 Active Audio-Visual Exploration for Acoustic Environment Modeling. Arjun Somayazulu, Sagnik Majumder, Changan Chen, Kristen Grauman. IROS 2024. Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos. Sagnik Majumder, Ziad Al-Halah, Kristen Grauman. CVPR 2024. [pdf] Visually-Guided Audio Spatialization in Video with Geometry-Aware Multi-task Learning. Rishabh Garg, Ruohan Gao, Kristen Grauman. International Journal of Computer Vision (IJCV). Vol 131. 2023. [pdf] Self-Supervised Visual Acoustic Matching. Arjun Somayazulu, Changan Chen, Kristen Grauman. NeurIPS 2023. [pdf] Learning to Map Efficiently by Active Echolocation. Xixi Hu, Senthil Purushwalkam, David Harwath, Kristen Grauman. IROS 2023. Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations. Sagnik Majumder, Hao Jiang, Pierre Moulon, Ethan Henderson, Paul Calamia, Kristen Grauman, Vamsi Ithapu. CVPR 2023. [pdf] [project] Novel-View Acoustic Synthesis. Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi. CVPR 2023. [pdf] [project] Learning Audio-Visual Dereverberation. Changan Chen, Wei Sun, David Harwath, Kristen Grauman. ICASSP 2023 [pdf] [project] SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning. 
Changan Chen, Carl Schissler, Sanchit Garg, Philip Kobernik, Alexander Clegg, Paul Calamia, Dhruv Batra, Philip W Robinson, Kristen Grauman. NeurIPS 2022 [pdf] [project page] Few-Shot Audio-Visual Learning of Environment Acoustics. Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman. NeurIPS 2022. [pdf] [project page] Visual Acoustic Matching. C. Chen, R. Gao, P. Calamia, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (Oral) [pdf] [project page] Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video. R. Garg, R. Gao, K. Grauman. In Proceedings of the British Machine Vision Conference (BMVC), 2021. (Oral) [Best Paper Award Runner Up] [pdf] [project page] Move2Hear: Active Audio-Visual Source Separation. S. Majumder, Z. Al-Halah, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021. [pdf] [code/videos] Audio-Visual Floorplan Reconstruction. S. Purushwalkam, S. V. A. Gari, V. K. Ithapu, C. Schissler, P. Robinson, A. Gupta, K. Grauman. ICCV 2021. [pdf] [project/video] VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency. R. Gao and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf] [project/video] SoundSpaces: Audio-Visual Navigation in 3D Environments. C. Chen, U. Jain, C. Schissler, S. V. Amengual Gari, Z. Al-Halah, V. Ithapu, P. Robinson, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020. (Spotlight) VisualEchoes: Spatial Image Representation Learning through Echolocation. R. Gao, C. Chen, Z. Al-Halah, C. Schissler, K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), August 2020. Listen to Look: Action Recognition by Previewing Audio. R. Gao, T-H. Oh, K. Grauman, L. Torresani. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. [pdf] Co-Separating Sounds of Visual Objects. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp] 2.5D Visual Sound. R. Gao and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. (Oral) [Best paper award finalist] [pdf] [supp] [FAIR-Play dataset] [videos] [code] Learning to Separate Object Sounds by Watching Unlabeled Video. R. Gao, R. Feris, and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. (Oral) [pdf] [videos]

	Domain adaptation and transfer learning
	SpotTune: Transfer Learning through Adaptive Fine-tuning. Y. Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] Inferring Unseen Views of People. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf] Inferring Analogous Attributes. C.-Y. Chen and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf] Learning Kernels for Unsupervised Domain Adaptation with Applications to Visual Object Recognition. B. Gong, K. Grauman, and F. Sha. International Journal of Computer Vision (IJCV), Volume 109, Issue 1-2, pp. 3-27, August 2014. [link] Reshaping Visual Datasets for Domain Adaptation. B. Gong, K. Grauman, and F. Sha. In Proceedings of Advances in Neural Information Processing Systems (NIPS), Tahoe, Nevada, December 2013. [pdf] Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] Geodesic Flow Kernel for Unsupervised Domain Adaptation. B. Gong, Y. Shi, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. (Oral) [pdf] [supp] Overcoming Dataset Bias: An Unsupervised Domain Adaptation Approach. B. Gong, F. Sha, and K. Grauman. In Big Data Meets Computer Vision: First International Workshop on Large Scale Visual Recognition and Retrieval. In conjunction with NIPS, 2012. (Oral) [pdf] Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation. B. Gong, K. Grauman, and F. Sha. In International Conference on Machine Learning (ICML), Atlanta, GA, June 2013. 
(Oral) [pdf] [supp] Relative Attributes. D. Parikh and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011. (Oral) [pdf] [Marr Prize, ICCV Best Paper Award]

	Video summarization
	Less is More: Learning Highlight Detection from Video Duration. B. Xiong, Y. Kalantidis, D. Ghadiyaram, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [videos] Retrospective Encoders for Video Summarization. K. Zhang, K. Grauman, F. Sha. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] Making 360 Video Watchable in 2D: Learning Videography for Click Free Viewing. Y-C. Su and K. Grauman. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Spotlight) [pdf] Pano2Vid: Automatic Cinematography for Watching 360° Videos. Y-C. Su, D. Jayaraman, and K. Grauman. Proceedings of the Asian Conference on Computer Vision (ACCV), Taipei, November 2016. (Oral, Best Application Paper Award) [pdf] [supp] Video Summarization with Long Short-term Memory. K. Zhang, W-L. Chao, F. Sha, and K. Grauman. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. [pdf] [supp] Detecting Engagement in Egocentric Video. Y-C. Su and K. Grauman. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, October 2016. (Oral) [pdf] [supp] Summary Transfer: Exemplar-based Subset Selection for Video Summarization. K. Zhang, W-L. Chao, F. Sha, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, June 2016. [pdf] [supp] Large-Margin Determinantal Point Processes. W-L. Chao, B. Gong, K. Grauman, and F. Sha. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), Amsterdam, Netherlands, July 2015. [pdf] [supp] Intentional Photos from an Unintentional Photographer: Detecting Snap Points in Egocentric Video with a Web Photo Prior. B. Xiong and K. Grauman. Invited chapter. In Mobile Cloud Visual Media Computing. Springer International Publishing. Editors: G. 
Hua and X.-S. Hua. pp 85-111. November 2015. [pdf] Predicting Important Objects for Egocentric Video Summarization. Y. J. Lee and K. Grauman. International Journal on Computer Vision (IJCV). Volume 114, Issue 1, pp. 38-55, August 2015. [link] [arxiv] Diverse Sequential Subset Selection for Supervised Video Summarization. B. Gong, W. Chao, K. Grauman, and F. Sha. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] Detecting Snap Points in Egocentric Video with a Web Photo Prior. B. Xiong and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, Sept 2014. [pdf] Story-Driven Summarization for Egocentric Video. Z. Lu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, June 2013. [pdf] Discovering Important People and Objects for Egocentric Video Summarization. Y. J. Lee, J. Ghosh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf]

	Fashion image analysis
	From Culture to Clothing: Discovering the World Events Behind A Century of Fashion Images. W-L. Hsiao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Oct 2021 (Oral). [pdf] [project page] Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, R. Feris. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [pdf] From Culture to Clothing: Discovering the World Events Behind A Century of Fashion Images. W-L. Hsiao, K. Grauman. arXiv:2102.01690, Feb 2021 [pdf] Discovering Underground Maps from Fashion. U. Mall, K. Bala, T. Berg, K. Grauman. WACV 2022 [pdf] Modeling Fashion Influence from Photos. Z. Al-Halah and K. Grauman. IEEE Transactions on Multimedia, Nov 2020. [pdf] [project] [code] ViBE: Dressing for Diverse Body Shapes. W-L. Hsiao and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. (Oral) [project page] [pdf] [supp] From Paris to Berlin: Discovering Fashion Style Influences Around the World. Z. Al-Halah and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, June 2020. [pdf] [supp] Densifying Supervision for Fine-Grained Comparisons. A. Yu and K. Grauman. International Journal of Computer Vision (IJCV), Special Issue on Generative Adversarial Networks for Computer Vision, 2020. [pdf] Fashion++: Minimal Edits for Outfit Improvement. W-L. Hsiao, I. Katsman, C-Y. Wu, D. Parikh, K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Korea, Nov 2019. [pdf] [supp] Thinking Outside the Pool: Active Training Image Creation for Relative Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. 
[pdf] [supp] [code/data] BrowseWithMe: An Online Clothes Shopping Assistant for People with Visual Impairments. A. Stangl, E. Kothari, S. Jain, T. Yeh, K. Grauman, D. Gurari. In Proceedings of The 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), Galway, Ireland, Oct 2018. [pdf] [video demo] Creating Capsule Wardrobes from Fashion Images. W-L. Hsiao and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Spotlight) [pdf] [supp] Learning the Latent "Look": Unsupervised Discovery of a Style-Coherent Embedding from Fashion Images. W-L. Hsiao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, Oct 2017. [pdf] Fashion Forward: Forecasting Visual Style in Fashion. Z. Al-Halah, R. Stiefelhagen, and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, Oct 2017. [pdf] [supp] WhittleSearch: Interactive Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. International Journal on Computer Vision (IJCV), Volume 115, Issue 2, pp 185-210, November 2015. [link] [arxiv] Just Noticeable Differences in Visual Attributes. A. Yu and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec 2015. [pdf] [supp] WhittleSearch: Image Search with Relative Attribute Feedback. A. Kovashka, D. Parikh, and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, June 2012. [pdf] [supp] [patented] Attribute Pivots for Guiding Relevance Feedback in Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] [patented] Fine-Grained Visual Comparisons with Local Learning. A. Yu and K. Grauman. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, June 2014. [pdf] Attribute Adaptation for Personalized Image Search. A. Kovashka and K. Grauman. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, December 2013. [pdf] Discovering Attribute Shades of Meaning with the Crowd. A. Kovashka and K. Grauman. International Journal on Computer Vision (IJCV), Volume 114, Issue 1, pp. 56-73, August 2015. [link] [arxiv]

	360 Images and Video
	Learning Spherical Convolution for 360 Recognition. Y-C. Su and K. Grauman. Transactions on Pattern Analysis and Machine Intelligence (PAMI), Sept 2021. [link]

	Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. To appear, Transactions on Pattern Analysis and Machine Intelligence (PAMI). Feb 2020.

	Kernel Transformer Networks for Compact Spherical Convolution. Y-C. Su and K. Grauman. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. [pdf] [supp] [code]

	Snap Angle Prediction for 360 Panoramas. B. Xiong and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp]

	Learning Compressible 360 Video Isomers. Y-C. Su and K. Grauman. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. [pdf] [supp] [data]

	Learning Spherical Convolution for Fast Features from 360° Imagery. Y-C. Su and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Long Beach, CA, Dec 2017. [pdf] [supp] [code/models]

	Making 360 Video Watchable in 2D: Learning Videography for Click Free Viewing. Y-C. Su and K. Grauman. To appear, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, July 2017. (Spotlight) [pdf]

	Pano2Vid: Automatic Cinematography for Watching 360° Videos. Y-C. Su, D. Jayaraman, and K. Grauman. Proceedings of the Asian Conference on Computer Vision (ACCV), Taipei, November 2016. (Oral, Best Application Paper Award) [pdf] [supp]

	Vision and language
	What You Say Is What You Show: Visual Narration Detection in Instructional Videos. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. arXiv 2023. [pdf]

	Video-Mined Task Graphs for Keystep Recognition in Instructional Videos. Kumar Ashutosh, Santhosh Kumar Ramakrishnan, Triantafyllos Afouras, Kristen Grauman. NeurIPS 2023. [pdf]

	HierVL: Learning Hierarchical Video-Language Embeddings. Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman. CVPR 2023. [pdf] [project]

	NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. CVPR 2023. [pdf] [project]

	SpotEM: Efficient Video Search for Episodic Memory. Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman. ICML 2023 [pdf] [project]

	Attributes as Operators. T. Nagarajan and K. Grauman. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, Sept 2018. [pdf] [supp] [code]

	Visual Question Answer Diversity. C-J. Yang, K. Grauman, and D. Gurari. In Proceedings of the Sixth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Zurich, July 2018. [pdf]

	VizWiz Grand Challenge: Answering Visual Questions from Blind People. D. Gurari, Q. Li, A. Stangl, A. Guo, C. Lin, K. Grauman, J. Luo, and J. Bigham. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Spotlight) [pdf] [supp]

	Attributes for Image Retrieval. A. Kovashka and K. Grauman. Chapter in Visual Attributes. R. Feris, C. Lampert, and D. Parikh, Editors. Springer. 2017.

	Other topics
	Learning Patterns of Tourist Movement and Photography from Geotagged Photos at Archaeological Heritage Sites in Cuzco, Peru. N. Payntar, W-L. Hsiao, A. Covey, K. Grauman. To appear, ACM Journal of Tourism Management, 2020. [arXiv]

	Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion. Z. Yang, J. Pan, L. Luo, X. Zhou, K. Grauman, and Q. Huang. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, June 2019. (Oral) [pdf] [supp] [code]

	VizWiz Grand Challenge: Answering Visual Questions from Blind People. D. Gurari, Q. Li, A. Stangl, A. Guo, C. Lin, K. Grauman, J. Luo, and J. Bigham. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Spotlight) [pdf] [supp]

	Visual Question Answer Diversity. C-J. Yang, K. Grauman, and D. Gurari. In Proceedings of the Sixth AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Zurich, July 2018. [pdf]

	BlockDrop: Dynamic Inference Paths in Residual Networks. Z. Wu, T. Nagarajan, A. Kumar, S. Rennie, L. Davis, K. Grauman, R. Feris. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, June 2018. (Spotlight) [pdf] [supp] [code]

	On-Demand Learning for Deep Image Restoration. R. Gao and K. Grauman. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, Oct 2017. [pdf]

	CrowdVerge: Predicting If People Will Agree on the Answer to a Visual Question. D. Gurari and K. Grauman. ACM Conference on Human Factors in Computing Systems (CHI), Denver, CO, May 2017. Best Paper Honorable Mention Award [pdf]

	Text Detection in Stores Using a Repetition Prior. B. Xiong and K. Grauman. In Proceedings of the IEEE Winter Conference on Computer Vision (WACV). Lake Placid, NY, March 2016. [pdf]

	Predicting Useful Neighborhoods for Lazy Local Learning. A. Yu and K. Grauman. In Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 2014. [pdf] [supp]

	Visual Object Recognition, Kristen Grauman and Bastian Leibe, Synthesis Lectures on Artificial Intelligence and Machine Learning, April 2011, Vol. 5, No. 2, Pages 1-181. [link]

	Reconstructing a Fragmented Face from a Cryptographic Identification Protocol. A. Luong, M. Gerbush, B. Waters, and K. Grauman. In Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV), Clearwater Beach, FL, January 2013. [pdf]

	Avoiding the "Streetlight Effect": Tracking by Exploring Likelihood Modes. D. Demirdjian, L. Taycher, G. Shakhnarovich, K. Grauman, and T. Darrell. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005. [pdf]

	Virtual Visual Hulls: Example-Based 3D Shape Inference from a Single Silhouette. K. Grauman, G. Shakhnarovich, and T. Darrell. In Proceedings of the 2nd Workshop on Statistical Methods in Video Processing, in conjunction with ECCV, Prague, Czech Republic, May 2004. [pdf]

	Inferring 3D Structure with a Statistical Image-Based Shape Model. K. Grauman, G. Shakhnarovich, and T. Darrell. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Nice, France, October 2003. [pdf]

	A Bayesian Approach to Image-Based Visual Hull Reconstruction. K. Grauman, G. Shakhnarovich, and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Madison, WI, June 2003. [pdf]

Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search. S. J. Hwang and K. Grauman. International Journal of Computer Vision (IJCV), published online October 2011. [link]

Learning the Relative Importance of Objects from Tagged Images for Retrieval and Cross-Modal Search. S. J. Hwang and K. Grauman. International Journal of Computer Vision (IJCV), Vol. 100, Issue 2, pp. 134-153, November 2012. [link]