UTCS Artificial Intelligence
Generating Natural-Language Video Descriptions Using Text-Mined Knowledge (2013)
Niveda Krishnamoorthy, Girish Malkarnenkar, Raymond J. Mooney, Kate Saenko, Sergio Guadarrama
We present a holistic data-driven technique that generates natural-language descriptions for videos. We combine the output of state-of-the-art object and activity detectors with "real-world" knowledge to select the most probable subject-verb-object triplet for describing a video. We show that this knowledge, automatically mined from web-scale text corpora, enhances the triplet selection algorithm by providing it contextual information and leads to a four-fold increase in activity identification. Unlike previous methods, our approach can annotate arbitrary videos without requiring the expensive collection and annotation of a similar training video corpus. We evaluate our technique against a baseline that does not use text-mined knowledge and show that humans prefer our descriptions 61 percent of the time.
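The triplet-selection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple log-linear combination of detector confidences with a text-derived prior over subject-verb-object triplets. The function name `select_triplet`, the `weight` and `floor` parameters, and all toy labels and numbers are hypothetical, chosen only to show how a text prior can change the selected triplet.

```python
from itertools import product
from math import log


def select_triplet(subjects, verbs, objects, text_prior, weight=1.0, floor=1e-6):
    """Return the (subject, verb, object) triplet with the highest combined score.

    subjects, verbs, objects: dicts mapping a label to a detector confidence in (0, 1].
    text_prior: dict mapping (subject, verb, object) to a probability estimated from text.
    weight: how strongly the text prior counts relative to the vision scores (assumed).
    floor: probability assigned to triplets absent from text_prior (assumed).
    """
    best, best_score = None, float("-inf")
    for (s, ps), (v, pv), (o, po) in product(
        subjects.items(), verbs.items(), objects.items()
    ):
        vision = log(ps) + log(pv) + log(po)            # detector evidence
        language = log(text_prior.get((s, v, o), floor))  # text-mined plausibility
        score = vision + weight * language
        if score > best_score:
            best, best_score = (s, v, o), score
    return best


# Toy detector outputs (made-up numbers, for illustration only).
subjects = {"person": 0.9, "dog": 0.4}
verbs = {"ride": 0.5, "walk": 0.55}
objects = {"horse": 0.6, "car": 0.5}

# Toy text prior (made-up numbers): "person ride horse" is common in text.
text_prior = {
    ("person", "ride", "horse"): 0.05,
    ("person", "walk", "horse"): 0.005,
    ("person", "ride", "car"): 0.02,
}

with_text = select_triplet(subjects, verbs, objects, text_prior)
vision_only = select_triplet(subjects, verbs, objects, text_prior, weight=0.0)
print(with_text, vision_only)
```

In this toy setting the vision-only choice follows the slightly higher "walk" detector score, while the text prior shifts the selection toward the more plausible "ride" triplet. Setting `weight=0.0` plays the role of a baseline that ignores text-mined knowledge.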
Citation:
In Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI-2013), pp. 541--547, July 2013.
Bibtex:
@inproceedings{krishnamoorthy:aaai13,
  title={Generating Natural-Language Video Descriptions Using Text-Mined Knowledge},
  author={Niveda Krishnamoorthy and Girish Malkarnenkar and Raymond J. Mooney and Kate Saenko and Sergio Guadarrama},
  booktitle={Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI-2013)},
  month={July},
  pages={541--547},
  url="http://www.cs.utexas.edu/users/ai-lab?krishnamoorthy:aaai13",
  year={2013}
}
People
Niveda Krishnamoorthy
Masters Alumni
niveda [at] cs utexas edu
Girish Malkarnenkar
Masters Alumni
girish [at] cs utexas edu
Raymond J. Mooney
Faculty
mooney [at] cs utexas edu
Areas of Interest
Computer Vision
Language and Vision
Machine Learning
Natural Language Processing
Labs
Machine Learning