Forum for Artificial Intelligence

Archive



[ Home   |   About FAI   |   Upcoming talks   |   Past talks ]



This website is the archive for past Forum for Artificial Intelligence talks. For the list of current talks, please follow the Upcoming talks link above.

FAI meets every other week (or so) to discuss scientific, philosophical, and cultural issues in artificial intelligence. Both technical research topics and broader inter-disciplinary aspects of AI are covered, and all are welcome to attend!

If you would like to be added to the FAI mailing list, subscribe here. If you have any questions or comments, please send email to Catherine Andersson.






[ Upcoming talks ]





Mon, August 25, 3:00PM - Alexander Koller (Saarland University): Generation as planning
Fri, August 29, 11:00AM - Sameer S. Pradhan (BBN Technologies): OntoNotes: A Unified Relational Semantic Representation
Thu, September 11, 3:00PM - Chad Jenkins (Brown University): Learning in Human-Robot Teams
Fri, September 12, 11:00AM - Dr. Jesse Davis (University of Washington): Statistical Relational Learning, Diagnosing Breast Cancer and Transfer Learning
Fri, September 26, 2:00PM - Bart Selman (Cornell University): The Synthesis of Probabilistic and Logical Inference Methods
Fri, October 10, 11:00AM - Yoonsuck Choe (Texas A&M University): Motor System's Role in Grounding, Development, and Recognition in Vision
Fri, October 17, 11:00AM - Rebecca Hwa (University of Pittsburgh): Learning Evaluation Metrics for Sentence-Level Machine-Translation
Mon, October 20, 11:00AM - Judea Pearl (University of California, Los Angeles): The Foundations of Causal Inference
Mon, October 20, 1:00PM - Bryan Pardo (Northwestern University): Teaching Machines to Listen
Fri, October 31, 11:00AM - Andrew Gordon (The University of Southern California): Reasoning with millions of stories collected from Internet weblogs
Fri, November 7, 11:00AM - Hod Lipson (Cornell University): Mining experimental data for dynamical invariants - from cognitive robotics to computational biology
Mon, November 10, 11:00AM - Hal Daumé (University of Utah): Domain adaptation in natural language
Mon, February 9, 10:00AM - Claudia Perlich (IBM T.J. Watson Research Center): Breast Cancer Identification: KDD CUP Winner's Report
Fri, February 13, 11:00AM - Zenzi Griffin (The University of Texas at Austin): How speakers' eye movements reflect spoken language generation
Fri, February 20, 11:00AM - Lillian Lee (Cornell University): A tempest: Or, On the flood of interest in sentiment analysis, opinion mining, and the computational treatment of subjective language
Fri, February 27, 11:00AM - Kristen Grauman (University of Texas at Austin): Efficient Visual Search and Learning
Fri, March 6, 11:00AM - Pedro F. Felzenszwalb (University of Chicago): Object Recognition with Part-Based Models
Mon, March 9, 3:00PM - Rada Mihalcea (University of North Texas): Linking Documents to Encyclopedic Knowledge: Using Wikipedia as a Source of Linguistic Evidence
Wed, March 25, 11:00AM - Marsette Vona (MIT): Virtual Articulations for Coordinated Motion in High-DoF Robots
Thu, March 26, 10:00AM - Lenhart K. Schubert (University of Rochester): Towards Generic Knowledge Acquisition from Text
Fri, April 3, 11:00AM - Joohyung Lee (Arizona State University): Circumscriptive Event Calculus as Answer Set Programming
Tue, April 14, 11:00AM - Daphne Koller (Stanford University): Probabilistic Models for Holistic Scene Understanding
Mon, April 27, 11:00AM - Cordelia Schmid (INRIA): Learning visual human actions from movies
Fri, May 29, 11:00AM - Matthew Taylor (University of Southern California): Balancing Multi-agent Exploration and Exploitation in Time-Critical Domains
Mon, June 22, 11:00AM - Shimon Whiteson (University of Amsterdam): Multi-Agent Reinforcement Learning for Urban Traffic Control using Coordination Graphs

Monday, August 25, 2008, 3:00PM



Generation as planning

Alexander Koller   [homepage]

Saarland University

Joint work with Matthew Stone and Ron Petrick.

The problem of natural language generation is intimately related to AI planning on many levels. In both problems, the computer has to search for a sequence of actions that combine in appropriate ways to achieve a given goal; in the case of generation, these actions may correspond to uttering speech acts, sentences, or individual words. This has been recognized in the literature for several decades, but there is currently a revival of interest in exploring this connection, which has been sparked especially by the recent efficiency improvements in planning.

In my talk, I will first show how sentence generation can be translated into a planning problem. This has the advantage that the (somewhat artificial) separation of sentence generation into microplanning and surface realization can be overcome. Furthermore, each plan action captures the complete grammatical, semantic, and pragmatic preconditions and effects of uttering a single word. I will then present a new shared task for the generation community, in which the system must generate instructions in a virtual environment. I will discuss some problems that arise in this application -- particularly regarding the use of extralinguistic context in generation -- and propose some ideas on how they can be tackled using a planning approach.
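
The reduction is easiest to see with a toy example. Below is a minimal sketch, in the spirit of STRIPS-style planning, of how uttering a single word can be encoded as an action with preconditions and effects; the predicates, the word, and the goal are invented for illustration and are not taken from the talk or from any particular grammar formalism.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Action:
        """A STRIPS-style operator: uttering one word."""
        name: str
        preconditions: frozenset   # facts that must hold before the word can be uttered
        add_effects: frozenset     # facts made true by uttering it
        del_effects: frozenset = frozenset()

    # "Utter the word 'rabbit' to describe object r1" as a planning operator.
    utter_rabbit = Action(
        name="utter-rabbit(r1)",
        preconditions=frozenset({("needs-noun", "np1"), ("rabbit", "r1")}),
        add_effects=frozenset({("referent-expressed", "np1", "r1")}),
        del_effects=frozenset({("needs-noun", "np1")}),
    )

    def applicable(state, action):
        return action.preconditions <= state

    def apply_action(state, action):
        return (state - action.del_effects) | action.add_effects

    state = frozenset({("needs-noun", "np1"), ("rabbit", "r1"), ("white", "r1")})
    if applicable(state, utter_rabbit):
        state = apply_action(state, utter_rabbit)
    print(("referent-expressed", "np1", "r1") in state)   # True: the communicative goal is met

A plan, in this view, is simply a sequence of such word-utterance actions whose combined effects satisfy the communicative goal.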

Friday, August 29, 2008, 11:00AM



OntoNotes: A Unified Relational Semantic Representation

Sameer S. Pradhan   [homepage]

BBN Technologies

The OntoNotes project is creating a corpus of large-scale, accurate, and integrated annotation of multiple levels of the shallow semantic structure in text. Such rich, integrated annotation covering many levels will allow for richer, cross-level models enabling significantly better automatic semantic analysis. At the same time, it demands a robust, efficient, scalable mechanism for storing and accessing these complex inter-dependent annotations. We describe a relational database representation that captures both the inter- and intra-layer dependencies and provide details of an object-oriented API for efficient, multi-tiered access to this data.

The OntoNotes project is funded by DARPA under the GALE program and is a collaborative effort between BBN Technologies, University of Colorado, University of Pennsylvania, and Information Sciences Institute at the University of Southern California.

About the speaker:

Sameer Pradhan completed his PhD in Computer Science at the University of Colorado, Boulder in 2005 under the guidance of Profs. Wayne Ward, James H. Martin, and Daniel Jurafsky. He is currently working as a Scientist at BBN Technologies. His prior academic pursuits include a Master of Science in Industrial Engineering from Alfred University, NY and a Bachelor's degree in Production Engineering from Victoria Jubilee Technical Institute, Bombay.

Thursday, September 11, 2008, 3:00PM



Learning in Human-Robot Teams

Chad Jenkins   [homepage]

Brown University

A principal goal of robotics is to realize embodied systems that are effective collaborators for human endeavors in the physical world. Human-robot collaborations can occur in a variety of forms, including autonomous robotic assistants, mixed-initiative robot explorers, and augmentations of the human body. For these collaborations to be effective, human users must be able to realize their intended behavior as actual robot control policies. At run time, robots should be able to manipulate an environment and engage in two-way communication in a manner suitable to their human users. Further, the tools for programming robots, communicating with them, and manipulating objects through them should be accessible to people with the diverse range of technical abilities present in society.

Towards the goal of effective human-robot collaboration, our research has pursued the use of learning and data-driven approaches to robot programming, communication, and manipulation. Learning from demonstration (LfD) has emerged as a central theme of our efforts towards natural instruction of autonomous robots by human users. In robot LfD, the desired robot control policy is implicit in human demonstration rather than explicitly coded in a computer program.

In this talk, I will describe our LfD-based work in policy learning using Gaussian Process Regression and humanoid imitation learning through spatio-temporal dimension reduction. This work is supported by our efforts in markerless, inertial-based, and physics-based human kinematic tracking, notably our indoor-outdoor person following system developed in collaboration with iRobot Research. I will additionally argue that collaboration in human-robot teams can be modeled by Markov Random Fields (MRFs), allowing for unification of existing multi-robot algorithms, application of belief propagation, and faithful modeling of experimental findings from cognitive science. Time permitting, I will also discuss our work learning tactile and force signatures to distinguish successful versus unsuccessful grasping on the NASA Robonaut.

About the speaker:

Odest Chadwicke Jenkins, Ph.D., is an Assistant Professor of Computer Science at Brown University. Prof. Jenkins earned his B.S. in Computer Science and Mathematics at Alma College (1996), M.S. in Computer Science at Georgia Tech (1998), and Ph.D. in Computer Science at the University of Southern California (2003). In 2007, he received Young Investigator funding from the Office of Naval Research and the Presidential Early Career Award for Scientists and Engineers (PECASE) for his work in learning primitive models of human motion for humanoid robot control and kinematic tracking.

Friday, September 12, 2008, 11:00AM



Statistical Relational Learning, Diagnosing Breast Cancer and Transfer Learning

Dr. Jesse Davis   [homepage]

University of Washington

Standard inductive learning makes two key assumptions about the structure of the data. First, it requires that all examples are independent and identically distributed (iid). Second, it requires that the training and test instances come from the same distribution. Decades of research have produced many sophisticated techniques for solving this task. Unfortunately, in real applications these assumptions are often violated. In the first part of this talk, I will motivate the need to handle non-iid data through the concrete task of predicting whether an abnormality on a mammogram is malignant. I will describe the SAYU algorithm, which automatically constructs relational features. Our system makes significantly more accurate predictions than both radiologists and other machine learning techniques on this task. Furthermore, we identified a novel feature that is indicative of malignancy. In the second part of this talk, I will discuss a transfer learning algorithm that removes both restrictions made by standard inductive learners. In shallow transfer, test instances are from the same domain, but have a different distribution. In deep transfer, test instances are from a different domain entirely (i.e., described by different predicates). Humans routinely perform deep transfer, but few learning systems are capable of it. I will describe an approach based on a form of second-order Markov logic, which discovers structural regularities in the source domain in the form of Markov logic formulas with predicate variables, and instantiates these formulas with predicates from the target domain. Using this approach, we have successfully transferred learned knowledge between a molecular biology domain and a Web one. The discovered patterns include broadly useful properties of predicates, like symmetry and transitivity, and relations among predicates, like various forms of homophily.
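
As a rough illustration of the structural regularities that deep transfer targets (my own example, not taken from the talk), a second-order clause with a predicate variable r can state a property such as transitivity or symmetry independently of any particular domain vocabulary:

    transitivity:   w1 :  r(x,y) ∧ r(y,z) ⇒ r(x,z)
    symmetry:       w2 :  r(x,y) ⇒ r(y,x)

In a molecular biology source domain, r might be instantiated by an interaction predicate; in a Web target domain, the same template might be instantiated by a hyperlink predicate, with the weights relearned from target-domain data.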

About the speaker:

Jesse Davis is a post-doctoral researcher at the University of Washington. He received his Ph.D in computer science at the University of Wisconsin - Madison in 2007 and a B.A. in computer science from Williams College in 2002. His research interests include statistical relational learning, transfer learning, inductive logic programming and data mining for biomedical domains.

Friday, September 26, 2008, 2:00PM



The Synthesis of Probabilistic and Logical Inference Methods

Bart Selman   [homepage]

Cornell University

In recent years, constraint reasoning methods have improved dramatically. Up until the early nineties, general constraint reasoning beyond hundred-variable problems appeared infeasible. Since then, we have witnessed a qualitative change in the field: current reasoning engines can handle problems with over a million variables and several million constraints. I will discuss what led to such a dramatic scale-up, emphasizing recent advances on combining probabilistic and logical inference techniques. I will also describe how progress in reasoning technology has opened up a range of new applications in AI and computer science in general.

About the speaker:

Bart Selman is a professor of computer science at Cornell University. His research interests include efficient reasoning procedures, planning, knowledge representation, and connections between computer science and statistical physics. He has (co-)authored over 100 papers, which have appeared in venues spanning Nature, Science, Proc. Natl. Acad. of Sci., and a variety of conferences and journals in AI and Computer Science. He has received six Best Paper Awards, and is an Alfred P. Sloan Research Fellowship recipient, a Fellow of AAAI, and a Fellow of AAAS.

Friday, October 10, 2008, 11:00AM



Motor System's Role in Grounding, Development, and Recognition in Vision

Yoonsuck Choe   [homepage]

Texas A&M University

Vision is basically a sensory modality, so it is no surprise that the investigation into the brain's visual functions has been focused on its sensory aspect. Thus, questions like (1) how can external geometric properties represented in the internal state of the visual system be grounded, (2) how do the visual cortical receptive fields (RFs) form, and (3) how can visual shapes be recognized have all been addressed within the framework of sensory information processing. However, this view is being challenged on multiple fronts, with an increasing emphasis on the motor aspect of visual function. In this talk, I will review works that implicate the important role of motor function in vision, and discuss our latest results touching upon the issues of grounding, RF development, and shape recognition. Our main findings are that (1) motor primitives play a fundamental role in grounding, (2) RF learning can be biased and enhanced by the motor system, and (3) shape recognition is easier with motor-based representations than with sensor-based representations. The insights we gained here will help us better understand visual cortical function. Also, we expect the motor-oriented view of visual cortical function to be generalizable to other sensory cortices such as somatosensory and auditory cortex.

About the speaker:

Yoonsuck Choe is an associate professor of Computer Science and the director of the Brain Networks Laboratory at Texas A&M University. He received his B.S. degree from Yonsei University, Korea (1993), and his M.A. and Ph.D. from the University of Texas at Austin (1995, 2001, respectively), all in Computer Science. His research interests are in computational neuroscience, computational neuroanatomy, biologically inspired vision, and neural networks.

Friday, October 17, 2008, 11:00AM



Learning Evaluation Metrics for Sentence-Level Machine-Translation

Rebecca Hwa   [homepage]

University of Pittsburgh

The field of machine translation (MT) has made major strides in recent years. An important enabling factor has been the adoption of automatic evaluation metrics to guide researchers in making improvements to their systems. Research in automatic evaluation metrics faces two major challenges. One is to achieve higher agreement with human judgments when evaluating MT outputs at the sentence-level. Another is to minimize the reliance on expensive, human-developed resources such as reference sentences. In this talk, I present a regression-based approach to metric development. Our experiments suggest that by combining a wide range of features, the resulting metric has higher correlations with human judgments. Moreover, we show that the features do not have to be extracted from comparisons with human produced references. Using weaker indicators of fluency and adequacy, our learned metrics rival standard reference-based metrics in terms of correlations with human judgments on new test instances.
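
The following is a minimal sketch of the regression idea only, not the talk's actual feature set or learner: per-sentence features (invented numbers standing in for an n-gram precision, a language-model fluency score, and a length ratio) are mapped to human judgments with ridge regression, and the fitted model then scores new translations.

    import numpy as np

    # Hypothetical per-sentence features and human judgments (1-5 scale); values are made up.
    X_train = np.array([
        [0.62, -4.1, 0.95],
        [0.35, -6.0, 1.30],
        [0.71, -3.8, 1.02],
        [0.20, -7.2, 0.60],
    ])
    y_train = np.array([4.0, 2.5, 4.5, 1.5])

    # Ridge regression in closed form: w = (X'X + lambda*I)^-1 X'y, with a bias column added.
    Xb = np.hstack([X_train, np.ones((len(X_train), 1))])
    lam = 0.1
    w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y_train)

    def score(features):
        """Predicted human judgment for one candidate translation."""
        return float(np.append(features, 1.0) @ w)

    print(round(score([0.55, -4.5, 1.00]), 2))   # metric score for a new sentence

Replacing reference-based features with weaker fluency and adequacy indicators changes only the columns of X, not the learning procedure, which is what allows the reliance on human references to be reduced.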

About the speaker:

Rebecca Hwa is an Assistant Professor in the Department of Computer Science at the University of Pittsburgh. Her research focus is on multilingual processing, machine translation, and semi-supervised learning methods. Before joining Pitt, she was a postdoctoral research fellow at the University of Maryland. She received her PhD from Harvard University and her B.S. from UCLA.

Monday, October 20, 2008, 11:00AM



The Foundations of Causal Inference

Judea Pearl   [homepage]

University of California, Los Angeles

I will review concepts, principles, and mathematical tools that were found useful in applications involving causal inference. The principles are based on structural-model semantics, in which functional (or counterfactual) relations represent physical processes. This semantical framework, enriched with a few ideas from logic and graph theory, gives rise to a complete, coherent, and friendly calculus of causation that unifies the structural, graphical and potential-outcome approaches to causation and resolves long-standing problems in several of the empirical sciences. These include questions of confounding, causal effect estimation, policy analysis, legal responsibility, effect decomposition, instrumental variables, and the integration of data from diverse studies.
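
One concrete example of the kind of result this calculus delivers (a standard identity, not specific to the talk): when a set of covariates Z satisfies the back-door criterion relative to treatment X and outcome Y, the effect of an intervention is identified from purely observational data by adjustment:

    P(y | do(x)) = Σ_z P(y | x, z) P(z)

The graphical side of the framework tells us when such a Z exists; the structural side tells us what the resulting expression means in terms of the underlying physical processes.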

About the speaker:

Judea Pearl is a professor of computer science and statistics at the University of California, Los Angeles. He joined the faculty of UCLA in 1970, where he currently directs the Cognitive Systems Laboratory and conducts research in artificial intelligence, human reasoning and philosophy of science. He has authored three books: Heuristics (1984), Probabilistic Reasoning (1988), and Causality (2000). A member of the National Academy of Engineering, and a Founding Fellow of the American Association for Artificial Intelligence (AAAI), Judea Pearl is the recipient of the IJCAI Research Excellence Award for 1999, the London School of Economics Lakatos Award for 2001, the ACM Allen Newell Award for 2004, and the 2008 Benjamin Franklin Medal in Computer and Cognitive Science from the Franklin Institute.

Monday, October 20, 2008, 1:00PM



Teaching Machines to Listen

Bryan Pardo   [homepage]

Northwestern University

Music collections comprise one of the most popular categories of online multimedia content, as evidenced by the millions of recordings available in online repositories such as Emusic, Yahoo! Music, Rhapsody and Apple's iTunes. These vast online collections let the average person access and hear more music than was possible for even music scholars only a few years ago. Of course, finding a music document is only the beginning - a step to initiate the task at hand. Bryan Pardo and his students in the Northwestern University Interactive Audio Lab develop key technologies that let composers, researchers, performers and casual listeners retrieve, study, edit and interact with audio in new ways. This talk will provide an overview of recent work in the lab. Projects include: a music search engine that finds a song from a melody sung to the computer (audio database search); a cell phone based karaoke game (social computing); a system that learns to recognize sounds from an audio mixture and uses its learned knowledge to label new recordings (machine learning and source identification); a system to separate out individual audio sources from a mixture of sounds (source separation); and a system to automatically personalize the user interface of audio production software by mapping human descriptors onto acoustic features (human computer interaction).

About the speaker:

Bryan Pardo is an assistant professor in the Northwestern University Department of Electrical Engineering and Computer Science, with appointments in the Music Cognition program and the Center for Technology and Social Behavior. Prof. Pardo received a M. Mus. in Jazz in 2001 and a Ph.D. in Computer Science in 2005, both from the University of Michigan. He has developed speech software for the Speech and Hearing department of the Ohio State University, statistical software for SPSS and worked as a machine learning researcher for General Dynamics. While finishing his doctorate, he taught in the Music Department of Madonna University. When he's not programming, writing or teaching, he performs throughout the United States on saxophone and clarinet at venues such as Albion College, the Chicago Cultural Center and the Detroit Concert of Colors.

Friday, October 31, 2008, 11:00AM



Reasoning with millions of stories collected from Internet weblogs

Andrew Gordon   [homepage]

The University of Southern California

The capacity for automated open-domain reasoning about events in the everyday world is central to a wide range of problems in Artificial Intelligence, but is also one of AI's most obvious failures. No systems exist today that can predict what might happen next, explain how things got this way, or imagine what it would be like if something were different, except within the context of very narrow, well-formalized domains. The phenomenal rise of Internet weblogs, where millions of people tell stories about the events of their everyday lives, offers new opportunities to make progress on this core AI challenge. In this talk, I will discuss our efforts to automatically extract millions of stories from Internet weblogs, and to use this text corpus as a case base for automated open-domain reasoning about events in the everyday world.

About the speaker:

Andrew Gordon is a Research Scientist and Research Associate Professor at the University of Southern California's Institute for Creative Technologies. He is the author of the 2004 book, Strategy Representation: An Analysis of Planning Knowledge. He received his Ph.D. in Computer Science from Northwestern University in 1999.

Friday, November 7, 2008, 11:00AM



Mining experimental data for dynamical invariants - from cognitive robotics to computational biology

Hod Lipson   [homepage]

Cornell University

This talk will describe new active learning processes for automated modeling of dynamical systems across a number of disciplines. One of the long-standing challenges in robotics is achieving robust and adaptive performance under uncertainty. The talk will describe an approach to adaptive behavior based on self-modeling, where a system continuously evolves multiple simulators of itself in order to make useful predictions. The robot is rewarded for actions that cause disagreement among predictions of different candidate simulators, thereby elucidating uncertainties. The concept of self modeling will then be generalized to other systems, demonstrating how analytical invariants can be derived automatically for physical systems purely from observation. Application to modeling physical and biological systems will be shown.

About the speaker:

Hod Lipson is an Associate Professor of Mechanical & Aerospace Engineering and Computing & Information Science at Cornell University in Ithaca, NY. He directs the Computational Synthesis group, which focuses on novel ways for automatic design, fabrication and adaptation of virtual and physical machines. He has led work in areas such as evolutionary robotics, multi-material functional rapid prototyping, machine self-replication and programmable self-assembly. Lipson received his Ph.D. from the Technion - Israel Institute of Technology in 1998, and continued to a postdoc at Brandeis University and MIT. His research focuses primarily on biologically-inspired approaches, as they bring new ideas to engineering and new engineering insights into biology.

Monday, November 10, 2008, 11:00AM



Domain adaptation in natural language

Hal Daumé   [homepage]

University of Utah

Supervised learning technology has led to systems for part-of-speech tagging, parsing, and named entity recognition with accuracies in the high 90% range. Unfortunately, the performance of these systems degrades drastically when they are applied to text outside their training domain (typically, newswire). Machine translation systems work fantastically for translating Parliamentary proceedings, but fall down when applied to other domains. I'll discuss research that aims to understand what goes wrong when models are applied outside their domain, and some (partial) solutions to this problem. I'll focus on named entity recognition and machine translation tasks, where we'll see a range of different sources of error (some of which are quite counter-intuitive!).
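
For readers who want something concrete, the sketch below shows one simple and widely used supervised domain-adaptation baseline, feature augmentation, in which every feature is duplicated into a shared copy and a domain-specific copy before a standard classifier is trained on the augmented vectors. It is my own illustration of the problem setting, not necessarily a method covered in the talk, and the feature names are invented.

    def augment(features: dict, domain: str) -> dict:
        """Map {feature: value} to shared + domain-specific copies."""
        out = {}
        for name, value in features.items():
            out[("shared", name)] = value      # copy shared across all domains
            out[(domain, name)] = value        # copy seen only by this domain
        return out

    news = augment({"word=officials": 1.0, "capitalized": 1.0}, domain="newswire")
    bio  = augment({"word=protein": 1.0, "capitalized": 0.0}, domain="biomedical")
    print(sorted(news))   # shared copies plus newswire-specific copies of each feature

The learner can then put weight on the shared copy of a feature when it behaves the same way in every domain, and on the domain-specific copy when it does not.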

Monday, February 9, 2009, 10:00AM



Breast Cancer Identification: KDD CUP Winner's Report

Claudia Perlich   [homepage]

IBM T.J. Watson Research Center

The KDD CUP 2008 was organized by Siemens Medical Solutions (http://www.kddcup2008.com/). They provided mammography-based data for around 3000 patients. Siemens used proprietary software to extract candidate regions from the original digital image data and to characterize each region in terms of 117 normalized numeric features with unknown interpretation. Task 1 was the identification of malignant candidate regions in mammography pictures with a ranking-based evaluation measure similar to ROC. Task 2 required submitting the longest list of healthy patients; any submission with even one false negative was disqualified. Our winning submission exploited a) the properties of the evaluation metrics to improve the model scores from a linear SVM and b) some form of data leakage that resulted in predictive information in the patient identifiers.

About the speaker:

Claudia Perlich received her M.Sc. in Computer Science from the University of Colorado at Boulder, a Diplom in Computer Science from Technische Universitaet Darmstadt, and her Ph.D. in Information Systems from the Stern School of Business, New York University. Her Ph.D. thesis concentrated on probability estimation in multi-relational domains that capture information about multiple entity types and the relationships between them. Her dissertation was recognized as an additional winner of the International SAP Doctoral Support Award Competition, and her submission placed second in the yearly data mining competition in 2003 (KDD-Cup 03). Claudia joined the Data Analytics Research group as a Research Staff Member in October 2004. She interned during summer 1999 in the Deep Computing for Commerce Research Group under Murray Campbell, working on financial trading behavior on Treasury Bonds. Her research interests are in machine learning for complex real-world domains and the comparative study of model performance as a function of domain characteristics.

Friday, February 13, 2009, 11:00AM



How speakers' eye movements reflect spoken language generation

Zenzi Griffin   [homepage]

The University of Texas at Austin

When people describe visually presented scenes, they gaze at each object for approximately one second before referring to it (Griffin & Bock, 2000). The time spent gazing at an object reflects the difficulty of selecting and retrieving a name for it (Griffin, 2001). Speakers even look at the objects that they intend to talk about for a second before they make speech errors (e.g., accidentally calling an axe "a hammer"; Griffin, 2004) and before they intentionally use inaccurate names to describe objects (e.g., deliberately calling a dog "a cat"; Griffin & Oppenheimer, 2006). Thus, speakers' eyes reveal when they prepare the words they use to refer to visible referents. Furthermore, recent experiments suggest that eye movement data may also constrain theories about syntactic planning in language production.

About the speaker:

Zenzi M. Griffin studies the processes that allow people to express anything using spoken language. She is particularly concerned with how people select and order words and phrases, and the way that they manage (and mismanage) the timing of word retrieval and the articulation of speech. Dr. Griffin is a graduate of the International Baccalaureate program at Kungsholmens Gymnasium in Stockholm, Sweden. She studied psychology at Stockholm University for one year before transferring to Michigan State University, where she earned a BA in Psychology and worked with Dr. Rose Zacks. In 1998, she received a Ph.D. in Cognitive Psychology (with a minor in Linguistics) from the Department of Psychology at the University of Illinois at Urbana-Champaign. There she worked with Dr. Kathryn Bock and Dr. Gary Dell. She spent three years as an assistant professor in the Department of Psychology at Stanford University. Dr. Griffin became an assistant professor in the School of Psychology at Georgia Tech in the summer of 2001 and was promoted to associate professor in 2005. She spent the 2006-2007 academic year as a visiting scientist at Hunter College in New York in the Language Acquisition Research Center. In 2008, she joined the Department of Psychology at the University of Texas at Austin as a full professor.

Friday, February 20, 2009, 11:00AM



A tempest: Or, On the flood of interest in sentiment analysis, opinion mining, and the computational treatment of subjective language

Lillian Lee   [homepage]

Cornell University

"What do other people think?" has always been an important consideration to most of us when making decisions. Long before the World Wide Web, we asked our friends who they were planning to vote for and consulted Consumer Reports to decide which dishwasher to buy. But the Internet has (among other things) made it possible to learn about the opinions and experiences of those in the vast pool of people that are neither our personal acquaintances nor well-known professional critics --- that is, people we have never heard of. Enter sentiment analysis, a flourishing research area devoted to the computational treatment of subjective and opinion-oriented language. Sample phenomena to contend with range from sarcasm in blog postings to the interpretation of political speeches. This talk will cover some of the motivations, challenges, and approaches in this broad and exciting field.

About the speaker:

Lillian Lee is an associate professor of computer science at Cornell University. Her research interests include natural language processing, information retrieval, and machine learning. She is the recipient of the inaugural Best Paper Award at HLT-NAACL 2004 (joint with Regina Barzilay), a citation in "Top Picks: Technology Research Advances of 2004" by Technology Research News (also joint with Regina Barzilay), and an Alfred P. Sloan Research Fellowship, and her group's work has been featured in the New York Times.

Friday, February 27, 2009, 11:00AM



Efficient Visual Search and Learning

Kristen Grauman   [homepage]

University of Texas at Austin

Image and video data are rich with meaning, memories, or entertainment, and they can even facilitate communication or scientific discovery. However, our ability to capture and store massive amounts of interesting visual data has outpaced our ability to analyze it. Methods to search and organize images directly based on their visual cues are thus necessary to make them fully accessible. Unfortunately, the complexity of the problem often leads to approaches that will not scale: conventional methods rely on substantial manually annotated training data, or have such high computational costs that the representation or data sources must be artificially restricted. In this talk I will present our work addressing scalable image search and recognition. I will focus on our techniques for fast image matching and retrieval, and introduce an active learning strategy that minimizes the annotations that a human supervisor must provide to produce accurate models.

While generic distance functions are often used to compare image features, we can use a sparse set of similarity constraints to learn metrics that better reflect their underlying relationships. To allow sub-linear time similarity search under the learned metrics, we show how to encode the metric parametrization into randomized locality-sensitive hash functions. Our learned metrics improve accuracy relative to commonly-used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases. In order to best leverage manual intervention, we show how the system itself can actively choose its desired annotations. Unlike previous work, our approach accounts for the fact that the optimal use of manual annotation may call for a combination of labels at multiple levels of granularity (e.g., a full segmentation on some images and a present/absent flag on others). I will provide results illustrating how these efficient strategies will enable a new class of applications that rely on the analysis of large-scale visual data, such as object recognition, activity discovery, or meta-data labeling.
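
A rough sketch of the hashing idea follows, under the simplifying assumption that the learned metric is a Mahalanobis metric M = G^T G: distances under M equal Euclidean distances between embedded points Gx, so standard random-hyperplane locality-sensitive hashing can be applied to Gx. The construction below is illustrative only; it is not the hash-function construction presented in the talk, and G here is just a random stand-in for a learned embedding.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_bits = 8, 16

    G = rng.normal(size=(d, d))          # stand-in for a learned linear embedding (M = G^T G)
    R = rng.normal(size=(n_bits, d))     # random hyperplanes defining the hash bits

    def hash_key(x):
        """n_bits-bit LSH key of x under the (assumed) learned metric."""
        return tuple((R @ (G @ x) > 0).astype(int))

    # Build a tiny hash table; at query time only items in the query's bucket are examined,
    # which is what gives sub-linear search time.
    database = [rng.normal(size=d) for _ in range(1000)]
    table = {}
    for i, x in enumerate(database):
        table.setdefault(hash_key(x), []).append(i)

    query = database[3] + 1e-3 * rng.normal(size=d)   # near-duplicate of item 3
    candidates = table.get(hash_key(query), [])
    print(3 in candidates, len(candidates))           # the near-duplicate very likely shares a bucket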

About the speaker:

Kristen Grauman is a Clare Boothe Luce Assistant Professor in the Department of Computer Sciences at the University of Texas at Austin. Before joining UT in 2007, she received the Ph.D. and S.M. degrees from the MIT Computer Science and Artificial Intelligence Laboratory. Her research in computer vision and machine learning focuses on visual search and recognition. She is a Microsoft Research New Faculty Fellow, and a recipient of an NSF CAREER award and the Frederick A. Howes Scholar Award in Computational Science.

Friday, March 6, 2009, 11:00AM



Object Recognition with Part-Based Models

Pedro F. Felzenszwalb   [homepage]

University of Chicago

Object recognition is one of the most important and challenging problems in computer vision. One of the main difficulties is in developing representations that can effectively capture typical variations in appearance that occur within a class of objects. Deformable part-based models provide a natural and elegant framework for addressing this problem. However, they also lead to difficult computational problems. I will describe efficient algorithms we have developed for finding objects in images using deformable part-based models. I will also describe a set of techniques we have developed for training part-based models from weakly labeled data, using a formalism we call latent SVM. We have used these methods to implement a state-of-the-art system for finding objects from a generic category, such as people or cars, in cluttered images.
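
The scoring rule at the heart of such models can be sketched in a few lines: the score of a root location is the root filter response plus, for each part, the best trade-off between the part's filter response and a quadratic deformation cost. The brute-force toy version below (invented numbers, no distance transforms, no image features) is only meant to make that structure concrete; real systems compute the inner maximization for all locations at once.

    def score_root(root_score, parts):
        """
        root_score: appearance score of the root filter at this location.
        parts: list of dicts with
          'responses': {displacement (dx, dy): appearance score of the part filter},
          'anchor':    ideal displacement of the part relative to the root,
          'def':       (a, b) quadratic deformation weights.
        """
        total = root_score
        for p in parts:
            ax, ay = p['anchor']
            a, b = p['def']
            # Best placement of this part: appearance minus deformation penalty.
            best = max(
                resp - (a * (dx - ax) ** 2 + b * (dy - ay) ** 2)
                for (dx, dy), resp in p['responses'].items()
            )
            total += best
        return total

    part = {
        'responses': {(4, 0): 1.2, (5, 0): 2.0, (6, 1): 1.8},
        'anchor': (5, 0),
        'def': (0.5, 0.5),
    }
    print(score_root(root_score=1.0, parts=[part]))   # 3.0: root (1.0) plus best part placement (2.0)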

About the speaker:

Pedro Felzenszwalb is an associate professor in the Department of Computer Science at the University of Chicago. His research focuses on computer vision, algorithms and artificial intelligence. He received his PhD in computer science from MIT in 2003.

Monday, March 9, 2009, 3:00PM



Linking Documents to Encyclopedic Knowledge: Using Wikipedia as a Source of Linguistic Evidence

Rada Mihalcea   [homepage]

University of North Texas

Wikipedia is an online encyclopedia that has grown to become one of the largest online repositories of encyclopedic knowledge, with millions of articles available for a large number of languages. In fact, Wikipedia editions are available for more than 200 languages, with a number of entries varying from a few pages to more than one million articles per language.

In this talk, I will describe the use of Wikipedia as a source of linguistic evidence for natural language processing tasks. In particular, I will show how this online encyclopedia can be used to achieve state-of-the-art results on two text processing tasks: automatic keyword extraction and word sense disambiguation. I will also show how the two methods can be combined into a system able to automatically enrich a text with links to encyclopedic knowledge. Given an input document, the system identifies the important concepts in the text and automatically links these concepts to the corresponding Wikipedia pages. Evaluations of the system showed that the automatic annotations are reliable and hardly distinguishable from manual annotations. Additionally, an evaluation of the system in an educational environment showed that the availability of encyclopedic knowledge within easy reach of a learner can improve both the quality of the knowledge acquired and the time needed to obtain such knowledge.
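
As a toy illustration of the linking step only (not the actual system, which must also rank candidate phrases by importance and disambiguate among multiple matching articles), the sketch below matches phrases in a text against a tiny, hypothetical index of Wikipedia titles:

    WIKI_TITLES = {
        "machine learning": "https://en.wikipedia.org/wiki/Machine_learning",
        "word sense disambiguation": "https://en.wikipedia.org/wiki/Word-sense_disambiguation",
    }

    def annotate(text, max_len=4):
        """Return (phrase, url) pairs for phrases in `text` that match an indexed title."""
        tokens = text.lower().split()
        links = []
        for i in range(len(tokens)):
            for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
                phrase = " ".join(tokens[i:j])
                if phrase in WIKI_TITLES:
                    links.append((phrase, WIKI_TITLES[phrase]))
        return links

    print(annotate("Recent machine learning methods help with word sense disambiguation"))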

About the speaker:

Rada Mihalcea is an Associate Professor of Computer Science at the University of North Texas. Her research interests are in lexical semantics, graph-based algorithms for natural language processing, and multilingual natural language processing. During 2004-2007, she served as president of the ACL Special Interest Group on the Lexicon. She serves or has served on the editorial boards of Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, and Research on Language and Computation. She is the recipient of a National Science Foundation CAREER award.

Wednesday, March 25, 2009, 11:00AM



Virtual Articulations for Coordinated Motion in High-DoF Robots

Marsette Vona   [homepage]

MIT

In this talk I show how to use virtual kinematic joints and links---"virtual articulations"---as the basis for a new kind of expressive, rapid, and generic graphical interface for specifying coordinated motions in robots with 10s to 100s of joints. Virtual links can represent task-relevant coordinate frames; virtual joints can parameterize task motion, and, by closing kinematic chains, constrain coordinated motions.

I demonstrate hardware results where NASA's 36-DoF All Terrain Hex Limbed Extra Terrestrial Explorer (ATHLETE) executes a variety of previously challenging coordinated motion tasks. I also show interactive operation of a simulated modular robot involving nearly 300 joints. These results are all based on my new articulated robot operations interface system, the Mixed Real/Virtual Interface, in which the operator attaches virtual articulations to a model of the robot and then interactively operates the combined structure. I cover core challenges in handling arbitrary topology, supporting a variety of joints, and scaling to large numbers of DoF with both convenience and speed.

About the speaker:

Marsette Vona is a Ph.D. candidate in EECS at MIT CSAIL with Professor Daniela Rus. His current work explores theory and applications for virtual articulations in robotics, operations interface software and hardware for exploration robots, and reliable compliant/proprioceptive climbing and walking strategies. From 2001 to 2003 Marsette was a software developer at NASA/JPL, where he helped build the award-winning science operations interface for the Mars Exploration Rover mission. His 2001 M.S. in EECS at MIT developed new techniques in precision metrology for machine tools, and his 1999 B.A. thesis in CS at Dartmouth College described a self-reconfiguring robot system based on compressing cube modules.

Thursday, March 26, 2009, 10:00AM



Towards Generic Knowledge Acquisition from Text

Lenhart K. Schubert   [homepage]

University of Rochester

Knowledge extraction from text has become an increasingly attractive way to tackle the long-standing "knowledge acquisition bottleneck" in AI, thanks to the burgeoning of on-line textual materials, and continuing improvements in natural language processing tools. In the KNEXT project (started 8 years ago at U. Rochester) we have been using compositional interpretation of parsed text to derive millions of general "factoids" about the world. Some examples, as translated from logical encodings into approximate English by KNEXT, are: CLOTHES CAN BE WASHED; PERSONS MAY SLEEP; A CHARGE OF CONSPIRACY MAY BE PROVEN IN SOME WAY; A MOUSE MAY HAVE A TAIL; A CAT MAY CATCH A MOUSE; etc. Viewed conservatively as existential or possibilistic statements, such factoids unfortunately do not provide a strong basis for reasoning. We would be better off with broadly quantified claims, such as that ANY GIVEN MOUSE ALMOST CERTAINLY HAS A TAIL, and IF A CAT CATCHES A MOUSE, IT WILL USUALLY KILL IT. How can we obtain such stronger knowledge? I will discuss several approaches that we are currently developing. Some involve further abstraction from KNEXT factoids using lexical semantic knowledge, while others involve direct interpretation of general facts stated in English. In all cases, issues in the formal representation of generic knowledge are encountered, of the type much-studied in linguistic semantics under such headings as "donkey anaphora", "dynamic semantics", and "generic passages". I will suggest a Skolemization approach which can be viewed as a method of generating frame-like or script-like knowledge directly from language.

About the speaker:

Lenhart Schubert is a professor of Computer Science at the University of Rochester, with primary interests in natural language understanding, knowledge representation and acquisition, reasoning, and self-awareness. While earning a PhD in Aerospace Studies at the University of Toronto, he became fascinated with AI and eventually joined the University of Alberta Computing Science Department and later (in 1988), the University of Rochester. He has over 100 publications in natural language processing and semantics, knowledge representation, reasoning, and knowledge acquisition, has chaired conference programs in these areas, and is an AAAI Fellow and a former Alexander von Humboldt Fellow.

Friday, April 3, 2009, 11:00AM



Circumscriptive Event Calculus as Answer Set Programming

Joohyung Lee   [homepage]

Arizona State University

Answer set programming is a recent declarative programming paradigm oriented towards knowledge-intensive applications. While its mathematical basis, the stable model semantics, originated in logic programming, its relationship to satisfiability checking (SAT) enabled a new method of computing answer sets using SAT solvers. Further studies show that answer set programming is closely related to other knowledge representation formalisms and provides a good computational mechanism for them, together with many building-block results.

This talk will focus on the relationship between the event calculus and answer set programming. The circumscriptive event calculus is a formalism for reasoning about actions and change, based on classical logic. By reformulating it as answer set programming, we show that answer set solvers can be used for event calculus reasoning. The method not only computes the full version of the event calculus (modulo grounding) but is also shown to be faster than the current SAT-based approach by a few orders of magnitude. The techniques developed here can be applied to studying the relationship between other classical logic formalisms and answer set programming.
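
For readers unfamiliar with the formalism, the heart of the event calculus is a commonsense law of inertia. One standard axiom, stated roughly here and not quoted from the talk, says that a fluent holds at a later time if some event initiated it earlier and nothing has clipped it in between:

    HoldsAt(f, t2)  ←  Happens(e, t1) ∧ Initiates(e, f, t1) ∧ t1 < t2 ∧ ¬Clipped(t1, f, t2)

Circumscription minimizes the event occurrences and effect axioms so that only what is explicitly stated is assumed to happen; the reformulation described in the talk lets answer set solvers carry out this kind of nonmonotonic reasoning.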

The talk will consist of three parts: introduction to the event calculus; introduction to answer set programming; the relationship between them.

This is joint work with Tae-Won Kim and Ravi Palla.

About the speaker:

Joohyung Lee is an assistant professor of Computer Science and Engineering at Arizona State University. He received his Ph.D. from the University of Texas at Austin. His research interests are in knowledge representation, logic programming and computational logics. A part of his work on relating nonmonotonic logics to classical logic won him an Outstanding Paper Honorable Mention Award at AAAI 2004.

Tuesday, April 14, 2009, 11:00AM



Probabilistic Models for Holistic Scene Understanding

Daphne Koller   [homepage]

Stanford University

Over recent years, computer vision has made great strides towards annotating parts of an image with symbolic labels, such as object categories (things) or segment types (stuff). However, we are still far from the goal of providing a semantic description of an image, such as "a man, walking a dog on a sidewalk, carrying a backpack". In this talk, I will describe some projects we have done that attempt to use probabilistic models to move us closer towards the goal.

The first part of the talk will present methods that use a more holistic scene analysis to improve our performance at core tasks such as object detection, segmentation, or 3D reconstruction. The second part of the talk will focus on finer-grained modeling of object shape, so as to allow us to annotate images with descriptive labels related to the object shape, pose, or activity (e.g., is a cheetah running or standing). These vision tasks rely on novel algorithms for core problems in machine learning and probabilistic models, such as efficient algorithms for probabilistic correspondence, transfer learning across related object classes for learning from sparse data, and more.

About the speaker:

Daphne Koller is a Professor of Computer Science at Stanford University. Her main research focus is in developing and using machine learning and probabilistic methods to model and analyze complex systems, and she is particularly interested in using these techniques to understand biological systems and the world around us. Professor Koller is the author of over 100 refereed publications, which have appeared in venues that include Science, Nature Genetics, and the Journal of Games and Economic Behavior. She is a Fellow of the American Association for Artificial Intelligence, and has received a number of awards, including the Sloan Foundation Faculty Fellowship in 1996, the ONR Young Investigator Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) from President Clinton in 1999, the IJCAI Computers and Thought Award in 2001, the Cox Medal for excellence in fostering undergraduate research at Stanford in 2003, the MacArthur Foundation Fellowship in 2004 and the first-ever ACM/Infosys award in 2008.

Monday, April 27, 2009, 11:00AM



Learning visual human actions from movies

Cordelia Schmid   [homepage]

INRIA

We address the problem of recognizing natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems, one of which is the lack of realistic and annotated video datasets. Our first contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation of human actions in videos. We evaluate alternative methods for action retrieval from scripts and show the benefits of a text-based classifier. Using the retrieved action samples for visual learning, we next turn to the problem of action classification in video. We present a new method for video classification that builds upon and extends several recent ideas including local space-time features, space-time pyramids and multi-channel non-linear SVMs. Furthermore, we show how to mine relevant context information and use it to improve action recognition. We finally apply the method to learning and classifying challenging action classes in movies and show promising results.

This is joint work with I. Laptev, M. Marszalek and B. Rozenfeld.

Friday, May 29, 2009, 11:00AM



Balancing Multi-agent Exploration and Exploitation in Time-Critical Domains

Matthew Taylor   [homepage]

University of Southern California

Substantial work has investigated balancing exploration and exploitation, but relatively little has addressed this tradeoff in the context of coordinated multi-agent interactions. In this talk I will introduce a class of problems in which agents must maximize their on-line reward, a decomposable function dependent on pairs of agents' decisions. Unlike previous work, agents must both learn the reward function and exploit it on-line, critical properties for a class of physically motivated systems, such as mobile wireless networks. I will introduce algorithms motivated by the Distributed Constraint Optimization Problem framework and demonstrate when, and at what cost, increasing agents' coordination can improve the global reward on such problems.

I will also briefly discuss a couple of additional projects I have worked on at USC in the past two semesters.

About the speaker:

Matthew E. Taylor is a postdoctoral research associate at the University of Southern California working under Milind Tambe. He graduated magna cum laude with a double major in computer science and physics from Amherst College in 2001. After working for two years as a software developer, he began his Ph.D. with an MCD fellowship from the College of Natural Sciences. He received his doctorate from the Department of Computer Sciences at the University of Texas at Austin in the summer of 2008. Current research interests include multi-agent systems, reinforcement learning, and transfer learning.

Monday, June 22, 2009, 11:00AM



Multi-Agent Reinforcement Learning for Urban Traffic Control using Coordination Graphs

Shimon Whiteson   [homepage]

University of Amsterdam

Since traffic jams are ubiquitous in the modern world, optimizing the behavior of traffic lights for efficient traffic flow is a critically important goal. Though most current traffic lights use simple heuristic protocols, more efficient controllers can be discovered automatically via multi-agent reinforcement learning, where each agent controls a single traffic light. However, in previous work on this approach, agents select only locally optimal actions without coordinating their behavior. In this talk I describe how to extend this approach to include explicit coordination between neighboring traffic lights. Coordination is achieved using the max-plus algorithm, which estimates the optimal joint action by sending locally optimized messages among connected agents. I present the first application of max-plus to a large-scale problem and provide empirical evidence that max-plus performs well on cyclic graphs, though it has been proven to converge only for tree-structured graphs. I also discuss our future plans for tackling traffic control problems by formalizing them as loosely-coupled multi-objective multiagent control problems.
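
A small self-contained sketch of max-plus on a coordination graph follows. The three-intersection chain, the action names, and the payoff numbers are invented for illustration and are not the traffic model from the talk; the point is only the message-passing structure, in which each agent repeatedly sends locally optimized messages to its neighbors and finally picks the action with the highest summed incoming messages.

    ACTIONS = ["extend_green", "switch"]

    # Pairwise payoffs f[(i, j)][(a_i, a_j)] for neighboring intersections i < j (invented).
    f = {
        (0, 1): {(a, b): (2.0 if a == b else 0.0) for a in ACTIONS for b in ACTIONS},
        (1, 2): {(a, b): (1.0 if (a, b) == ("switch", "switch") else 0.5)
                 for a in ACTIONS for b in ACTIONS},
    }
    edges = list(f)
    neighbors = {0: [1], 1: [0, 2], 2: [1]}

    # Messages mu[(i, j)][a_j]: agent i's current estimate of the value of j taking a_j.
    mu = {(i, j): {a: 0.0 for a in ACTIONS} for (i, j) in edges}
    mu.update({(j, i): {a: 0.0 for a in ACTIONS} for (i, j) in edges})

    def payoff(i, j, ai, aj):
        return f[(i, j)][(ai, aj)] if (i, j) in f else f[(j, i)][(aj, ai)]

    for _ in range(20):   # fixed number of sweeps; on a tree the messages converge exactly
        for (i, j) in list(mu):
            for aj in ACTIONS:
                mu[(i, j)][aj] = max(
                    payoff(i, j, ai, aj)
                    + sum(mu[(k, i)][ai] for k in neighbors[i] if k != j)
                    for ai in ACTIONS
                )

    # Each agent picks the action that maximizes its summed incoming messages.
    joint = {i: max(ACTIONS, key=lambda a: sum(mu[(k, i)][a] for k in neighbors[i]))
             for i in neighbors}
    print(joint)   # here all three agents settle on the coordinated joint action

On graphs with cycles the messages are not guaranteed to converge, which is why the empirical evidence mentioned in the abstract matters.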

About the speaker:

Shimon Whiteson is an Assistant Professor in the Intelligent Autonomous Systems group at the University of Amsterdam. He received his PhD in 2007 from the University of Texas at Austin, working with Peter Stone. His current research interests focus on single- and multi-agent decision-theoretic planning and reinforcement learning.

[ FAI Archives ]

Fall 2022 - Spring 2023

Fall 2021 - Spring 2022

Fall 2020 - Spring 2021

Fall 2019 - Spring 2020

Fall 2018 - Spring 2019

Fall 2017 - Spring 2018

Fall 2016 - Spring 2017

Fall 2015 - Spring 2016

Fall 2014 - Spring 2015

Fall 2013 - Spring 2014

Fall 2012 - Spring 2013

Fall 2011 - Spring 2012

Fall 2010 - Spring 2011

Fall 2009 - Spring 2010

Fall 2008 - Spring 2009

Fall 2007 - Spring 2008

Fall 2006 - Spring 2007

Fall 2005 - Spring 2006

Spring 2005

Fall 2004

Spring 2004

Fall 2003

Spring 2003

Fall 2002

Spring 2002

Fall 2001

Spring 2001

Fall 2000

Spring 2000