Forum for Artificial Intelligence


[ About FAI   |   Talks and Abstracts ]

About FAI

The Forum for Artificial Intelligence meets every other week (or so) to discuss scientific, philosophical, and cultural issues in artificial intelligence. Both technical research topics and broader inter-disciplinary aspects of AI are covered, and all are welcome to attend!

If you would like to be added to the FAI mailing list, or have any questions or comments, please send email to Misha Bilenko, Nick Jong, or Peter Yeh.

Talks and Abstracts

Friday May 5
3:30pm, ACES 6.304
Ilkka Niemela,
Helsinki University of Technology
Towards Efficient Boolean Circuit Satisfiability Checking
Friday Apr. 14
11:00am, ACES 2.302
Eduard Hovy,
University of Southern California
Learning by Reading: An Experiment in Text Analysis
Friday Apr. 7
11:00am, ACES 6.304
Victor Marek,
University of Kentucky
Set-based logic programming
Friday Mar. 31
3:00pm, ACES 2.302
Jason Baldridge,
University of Texas at Austin
Data-Driven Discourse Parsing
Friday Mar. 10
11:00am, ACES 2.302
Daniel Lee,
University of Pennsylvania
Learning in Artificial Sensorimotor Systems
Friday Feb. 24
11:00am, ACES 2.302
David Goldberg,
University of Illinois at Urbana-Champaign
The Key to Speed: How Supermultiplicative Speedups Enable the Optimization of Very Large, Hard Problems
Wednesday Feb. 22
4:00pm, ACES 2.302 (Avaya Auditorium)
Rob Holte,
University of Alberta
Effective Short-Term Opponent Exploitation in Simplified Poker
Friday Feb. 17
3:00pm, ACES 2.402
Andrew Barto,
University of Massachusetts at Amherst
Intrinsic Motivation and Computational Reinforcement Learning
Friday Feb. 10
11:00am, ACES 6.304
Lise Getoor,
University of Maryland
Link Mining
Friday Jan. 27
11:00am, ACES 2.302
Satinder Singh,
University of Michigan, Ann Arbor
Rethinking State, Action, and Reward in Reinforcement Learning
Thursday Jan. 26
11:00am, ACES 2.402
David Moriarty,
Apple Computer
Detecting Online Credit Card Fraud: A Data Driven Approach
Friday Jan. 20
11:00am, ACES 2.302
Thorsten Joachims,
Cornell University
Support Vector Machines for Structured Outputs
Friday Dec. 16
11:00am, TAY 3.128
Pat Langley,
Institute for the Study of Learning and Expertise and Stanford University
Learning Hierarchical Task Networks from Problem Solving
Friday Dec. 9
11:00am, ACES 2.402
Marty Mayberry,
Universität des Saarlandes
A Connectionist Model of Sentence Comprehension in Visual Worlds
Thursday Dec. 1
11:00am, TAY 3.128
G. Michael Youngblood,
University of Texas at Arlington
Deconstructing the First-Person Shooter to Understand Human-Consistency and Transfer Learning to Create Better Artificially Intelligent Players
Monday Nov. 21
1:00pm, ACES 2.302
Dan Roth,
University of Illinois at Urbana-Champaign
Global Inference in Learning for Natural Language Processing
Friday Nov. 11
11:00am, ACES 2.302
Ted Gibson,
MIT
Top-down and bottom-up influences in human language comprehension
Friday Nov. 11
3:00pm, TAY 3.128
Carl E. Hewitt,
MIT
Semantics for Autonomy and Interdependence in Agents
Thursday Oct. 27
3:45pm, TAY 2.106
Leslie G. Valiant,
Harvard University
A Quantitative Theory of Neural Computation
Thursday Sep. 29
3:30pm, TAY 3.128
Fernando Fernandez,
Carnegie Mellon University
Probabilistic Policy Reuse

Friday, May 5th, 3:30pm

Coffee at 3:15pm

ACES 6.304

Towards Efficient Boolean Circuit Satisfiability Checking

Dr. Ilkka Niemela   [homepage]

Head of the Laboratory for Theoretical Computer Science
Helsinki University of Technology

Boolean circuits offer a natural, structured, and compact representation of Boolean functions for many application domains such as computer aided verification. We study satisfiability checking methods for Boolean circuits. As a starting point we take the successful Davis-Putnam-Logemann-Loveland (DPLL) procedure for satisfiability checking of propositional formulas in conjunctive normal form and study its generalization to Boolean circuits. We employ a tableau formulation where DPLL propagation rules correspond to tableau deduction rules and splitting corresponds to a tableau cut rule. It turns out that Boolean circuits enable interesting deduction (simplification) rules not typically available in DPLL where the idea is to exploit the structure of the circuit. We also study the relative efficiency of different variations of the cut (splitting) rule obtained by restricting the use of cut in several natural ways. A number of exponential separation results are obtained showing that the more restricted variations cannot polynomially simulate the less restricted ones. The results also apply to DPLL for formulas in conjunctive normal form obtained from Boolean circuits by using Tseitin's translation. Thus DPLL with the considered cut restrictions, such as allowing splitting only on the variables corresponding to the input gates, cannot polynomially simulate DPLL with unrestricted splitting.

(This is joint work with Tommi Junttila and Matti Jarvisalo.)
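
Tseitin's translation mentioned above is simple enough to sketch. The following Python fragment is a minimal illustration only (the circuit encoding, gate names, and clause representation are invented for this example; it is not the BCSat tool): each gate gets its own variable, and the gate's semantics become a handful of CNF clauses.

# Minimal sketch of Tseitin's translation: each gate of a Boolean circuit
# gets its own variable, and the gate's semantics become CNF clauses.
# The circuit encoding below is a made-up example, not the BCSat format.

def tseitin(gates):
    """gates: dict gate -> (op, inputs); returns a list of CNF clauses.
    Literals are (name, polarity) pairs; polarity False means negated."""
    clauses = []
    for g, (op, ins) in gates.items():
        if op == "NOT":
            a, = ins
            clauses += [[(g, False), (a, False)], [(g, True), (a, True)]]
        elif op == "AND":
            clauses.append([(g, True)] + [(a, False) for a in ins])   # all inputs true -> g
            clauses += [[(g, False), (a, True)] for a in ins]         # g -> each input
        elif op == "OR":
            clauses.append([(g, False)] + [(a, True) for a in ins])   # g -> some input
            clauses += [[(g, True), (a, False)] for a in ins]         # each input -> g
    return clauses

# Example circuit: out = AND(OR(x, y), NOT(x)); assert the output gate true.
circuit = {"o1": ("OR", ["x", "y"]), "n1": ("NOT", ["x"]), "out": ("AND", ["o1", "n1"])}
cnf = tseitin(circuit) + [[("out", True)]]
print(len(cnf), "clauses")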

About the speaker:

Ilkka Niemela has been professor and head of the Laboratory for Theoretical Computer Science at Helsinki University of Technology since 2000. He received his doctoral degree in computer science from Helsinki University of Technology in 1993 and has worked as an International Fellow at SRI International (1993), as a research scientist and acting professor in the Department of Computer Science of the University of Koblenz-Landau, Germany (1995-1996), and as a senior research fellow of the Academy of Finland (1998-2000).

Dr. Niemela's current research interests include automated reasoning, knowledge representation, computational complexity, computer aided verification, automated testing and product configuration. At Helsinki University of Technology he leads the computational logic group which has developed a number of the state-of-the-art software tools for automated reasoning, such as the Smodels system for answer set programming and BCSat for Boolean circuit satisfiability checking, leading to applications in areas like automated planning, product configuration, and bounded model checking. Dr. Niemela is an author of more than 100 papers, has been a member of the program committee for over 40 international conferences and has given several invited talks and tutorials.

Dr. Niemela is a member of the Executive Committee of the Association for Logic Programming (ALP), Editorial Board Member of Theory and Practice of Logic Programming and Journal of Artificial Intelligence Research as well as a Steering Committee Member of the International Workshops on Nonmonotonic Reasoning and of the International Conferences on Logic Programming and Nonmonotonic Reasoning.

Friday, April 14th, 11:00am

Coffee at 10:45am

ACES 2.302 (Avaya Auditorium)

Learning by Reading: An Experiment in Text Analysis

Dr. Eduard Hovy   [homepage]

USC Information Sciences Institute
University of Southern California

[Sign-up schedule for individual meetings]

A few years ago, three research groups participated in an audacious experiment called Project Halo: (manually) converting the information contained in one chapter of a high school chemistry textbook into knowledge representation statements, and then having the knowledge representation system take the high school AP exam. Surprisingly, all three systems passed, albeit at a relatively low level of performance. Could one do the same, automatically? If not fully, how far can one go? Since October, several projects have taken up this challenge, or aspects of it. Our Learning by Reading project at ISI, drawing on part-time participation of experts in NLP and KR&R, addresses the problem from the perspective of NLP. After suitable analysis and preparation, we parse the Chemistry textbook and then convert the results into very shallow pre-logic predications, which are asserted to a knowledge base. The evaluation, still in progress, has two aspects. In the first, we apply questions to the system at various levels (text-only, knowledge level without inference, the latter with inference), and compare performance. In the second, we (in conjunction with other groups) compare the system's bottom-up, automatically derived representations to the top-down ones created by hand by those groups. Although this and related projects are merely pilot studies, they are nonetheless likely to generate some interesting conclusions regarding the gap between what automated systems can deliver and what human knowledge engineers deem necessary, in the fascinating endeavor of learning by reading.

About the speaker:

Eduard Hovy leads the Natural Language Research Group at the Information Sciences Institute of the University of Southern California. He is also Deputy Director of the Intelligent Systems Division, as well as a research associate professor of the Computer Science Department of USC and Advisory Professor of the Beijing University of Posts and Telecommunications. He completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987. His research focuses on information extraction, automated text summarization, question answering, the semi-automated construction of large lexicons and ontologies, machine translation, and digital government. Dr Hovy regularly serves in an advisory capacity to funders of NLP research in the US and EU. He is the author or co-editor of five books and over 170 technical articles. In 2001 Dr. Hovy served as President of the Association for Computational Linguistics (ACL) and in 2001-03 as President of the International Association of Machine Translation (IAMT). Dr. Hovy regularly co-teaches a course in the Masters Degree Program in Computational Linguistics at the University of Southern California, as well as occasional short courses on MT and other topics at universities and conferences. He has served on the Ph.D. and M.S. committees for students from USC, Carnegie Mellon University, the Universities of Toronto, Karlsruhe, Pennsylvania, Stockholm, Waterloo, Nijmegen, Pretoria, and Ho Chi Minh City.

Friday, April 7th, 11am

Coffee at 10:45am

ACES 6.304

Set-based logic programming

Dr. Victor Marek   [homepage]

Department of Computer Sciences
University of Kentucky

[Sign-up schedule for individual meetings]

Since its inception, Logic Programming (and one of its practical versions, Answer Set Programming, or ASP) has been trying to follow two directions at once. One is treating programs as sets of formulas of some logic formalism. The other is an approach related to the treatment of programs as descriptions of inductively defined sets of objects.

Recently, with the spectacular results of researchers in the first area showing that there is a logic that correctly formalizes fundamental aspects of answer set programming, one could get the impression that the dichotomy of approaches presented above has been settled, with logic (in this case maximal intermediate logic N) providing the ultimate explanation of the subject.

This presentation discusses the other approach to ASP and its consequences. We show that by assigning a "sense" to atoms, one can use the ASP paradigm in areas different from classical logic programming. From this perspective it turns out that Gelfond-Lifschitz stable models involve two distinct concepts, not one. We show how various classical mathematical notions can be formalized in the resulting formalism.

The reported research is joint with H.A. Blair of Syracuse University and J.B. Remmel of UCSD.
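
For readers unfamiliar with the Gelfond-Lifschitz construction mentioned above, the brute-force Python sketch below checks every subset of atoms against the reduct-based definition of a stable model; the three-rule program and the rule encoding are invented for illustration.

from itertools import chain, combinations

# Brute-force Gelfond-Lifschitz stable models for a ground normal logic program.
# A rule is (head, positive_body, negative_body); the example program is made up.

rules = [
    ("p", [], ["q"]),   # p :- not q.
    ("q", [], ["p"]),   # q :- not p.
    ("r", ["p"], []),   # r :- p.
]
atoms = {a for h, pos, neg in rules for a in [h] + pos + neg}

def minimal_model(positive_rules):
    """Least model of a negation-free program, by forward chaining."""
    model, changed = set(), True
    while changed:
        changed = False
        for h, pos in positive_rules:
            if set(pos) <= model and h not in model:
                model.add(h)
                changed = True
    return model

def is_stable(candidate):
    # GL reduct: drop rules whose negative body intersects the candidate,
    # then delete the remaining negative literals.
    reduct = [(h, pos) for h, pos, neg in rules if not (set(neg) & candidate)]
    return minimal_model(reduct) == candidate

subsets = chain.from_iterable(combinations(sorted(atoms), k) for k in range(len(atoms) + 1))
print([set(s) for s in subsets if is_stable(set(s))])   # the stable models: {'q'} and {'p', 'r'}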

About the speaker:

V.W. Marek received his Ph.D. in 1968 and his D.Sc. in 1972 from the University of Warsaw, Poland. Originally interested in Set Theory and Recursion Theory, he shifted his interests in the mid-1980s to Knowledge Representation and Nonmonotonic Reasoning.

For a number of years he taught at Warsaw University where, as a successor of Mostowski, he was head of the Foundations of Mathematics Group. In 1983 he moved to the Department of Computer Science at the University of Kentucky. He has spent extended periods at the Mathematical Sciences Institute of Cornell University, where for several years he was an associate researcher, and later at the University of California, San Diego.

He is the author of four books (most recently on Satisfiability) and of over 150 journal and refereed conference papers. He was a member of the program committees of numerous conferences and started the conference series "Logic Programming and Nonmonotonic Reasoning".

Friday, March 31st, 3pm

Coffee at 2:45pm

ACES 2.302

Data-Driven Discourse Parsing

Dr. Jason Baldridge   [homepage]

Department of Linguistics
University of Texas at Austin

Computing the structure of discourse is both representationally and computationally challenging. It is largely agreed that discourses consist of segments that are related to one another through rhetorical relations and the goals and intentions of the speaker(s). While some theories postulate a context-free tree representation of discourse structure, there are strong arguments that quite general acyclic graphs are representationally necessary for adequately capturing the rhetorical connections of discourse segments within a text or dialog. This leads to an explosion of alternative potential analyses that is difficult to rein in even with very sophisticated machine learning models. Another challenge is that many sources of information (e.g., sentence moods, discourse cue phrases, goals and intentions, and domain-specific information) go into the determination of segmentation and rhetorical relationships. This information can be difficult to utilize effectively, especially in the face of data sparsity.

In this talk, I will discuss data and a statistical parser for analyzing appointment scheduling dialogs. The parser, which is based on the sentence parsing models of Collins, builds discourse structures of Segmented Discourse Representation Theory. I will highlight some of the adequacies and inadequacies of this approach for this task, and then present a new approach based on recent developments in sentence-level dependency parsing. Though this approach brings with it new representational challenges, it promises to greatly improve both the process of annotation and accuracy in the automatic recovery of discourse structures.

About the speaker:

Jason Baldridge is an assistant professor in the Department of Linguistics at the University of Texas at Austin. He completed his dissertation on categorial grammars at the University of Edinburgh in 2002, advised by Mark Steedman. From 2002 to 2005, he held a post-doctoral position at Edinburgh working with Alex Lascarides and Miles Osborne. His current work includes research on probabilistic parsing for Portuguese, discriminative parse ranking models, probabilistic discourse parsing for Segmented Discourse Representation Theory, active learning, and formal syntax using categorial grammars and other constraint-based formalisms. With Nicholas Asher, he recently began an NSF-funded project to investigate the integration of discourse structure and coreference resolution using machine learning. He has been active for many years in the creation and promotion of open source software for natural language processing.

Friday, March 10th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Learning in Artificial Sensorimotor Systems

Dr. Daniel Lee   [homepage]

Department of Electrical and Systems Engineering
University of Pennsylvania

[Sign-up schedule for individual meetings]

Many algorithms in machine learning involve changing the underlying dimensionality of the data set. Unsupervised learning techniques such as principal components analysis typically involve dimensionality reduction, whereas supervised learning techniques such as support vector machines can be understood as mapping the data to a higher dimensional space. Equivalent problems emerge when considering information processing in sensorimotor systems. Sensory processing requires mapping high-dimensional sensory inputs onto a smaller number of perceptually-relevant features, whereas motor learning involves driving a large number of actuator parameters with a smaller number of control variables. I will describe some of our recently developed learning algorithms that utilize changes in dimensionality, and demonstrate their application on some prototypical robotic systems.
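
As a point of reference for the dimensionality-reduction side of the talk, here is a minimal principal components analysis sketch in Python/numpy; the random data, dimensions, and the choice of five components are placeholders, and this is not one of the speaker's own algorithms.

import numpy as np

# Minimal PCA sketch: project high-dimensional data onto the directions of
# largest variance. Toy random data stand in for real sensory input.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))          # 200 samples, 50-dimensional "sensory" input
Xc = X - X.mean(axis=0)                 # center the data

# Principal directions come from the SVD of the centered data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 5                                   # keep a 5-dimensional feature space
Z = Xc @ Vt[:k].T                       # low-dimensional representation

explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(Z.shape, f"variance retained: {explained:.2f}")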

About the speaker:

Daniel D. Lee is currently an Associate Professor of Electrical and Systems Engineering at the University of Pennsylvania, with a secondary appointment in the Department of Bioengineering. He received his B.A. in Physics from Harvard University in 1990, and his Ph.D. in Condensed Matter Physics from the Massachusetts Institute of Technology in 1995. He was a researcher at Bell Laboratories, Lucent Technologies, from 1995-2001 in the Theoretical Physics and Biological Computation departments. His research focuses on understanding the general principles that biological systems use to process and organize information, and on applying that knowledge to build better artificial sensorimotor systems. He resides in New Jersey with his wife Lisa, four-year old son Jordan, and two-year old daughter Jessica.

Friday, February 24th, 11am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

The Key to Speed: How Supermultiplicative Speedups Enable the Optimization of Very Large, Hard Problems

Dr. David Goldberg   [homepage]

Department of Industrial and Enterprise Systems Engineering
University of Illinois at Urbana-Champaign

[Sign-up schedule for individual meetings]

Genetic algorithms (GAs)—search procedures based on the mechanics of natural selection and genetics—have been used across the spectrum of human endeavor, especially in problems that defy solution by traditional methods of search, optimization, and machine learning. The folklore of GAs suggests that they can often be effective performers, but they are rarely considered to be speed demons, and some authors complain that they are notoriously slow. Recent work has established a design theory and methodology for competent GAs—genetic algorithms that solve large classes of hard problems, quickly, reliably and accurately—and those studies have been augmented by a growing literature on GA efficiency enhancement that explores a variety of methods for speeding GA solutions. The news from competence and efficiency studies has been good, suggesting that nearly decomposable problems can be solved in subquadratic times and that efficiency enhancement modes such as parallelism, time continuation, hybridization, and evaluation relaxation can be combined to yield multiplicative speedups over competent GAs without efficiency enhancement. These results are welcome in and of themselves, but a number of recent studies have broken through the multiplicative barrier to achieve supermultiplicative speedups on hard problems through the tight integration of model building and efficiency enhancement. These results are just now being understood and fully explored, but they offer the promise of optimizing very large, hard problems day in and day out. This talk explores the foundations, principles, and promise of supermultiplicative speedup of competent procedures. Examples will be given of the integration of distribution estimation with evaluation relaxation and time continuation, and a final discussion will suggest how these results are leading to the near-term possibility of using these techniques to effectively optimize problems with hundreds of millions or even billions of variables.
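
For readers new to the vocabulary, the sketch below is a bare-bones genetic algorithm on the toy "onemax" problem, just to make selection, crossover, and mutation concrete; the population size, rates, and problem are invented, and it bears no resemblance to the competent, model-building GAs the talk describes.

import random

# Bare-bones genetic algorithm on the toy "onemax" problem (maximize the number
# of 1 bits). Population size, rates, and the problem itself are illustrative only.

random.seed(0)
N_BITS, POP, GENERATIONS, MUT_RATE = 40, 60, 50, 1.0 / 40

def fitness(ind):
    return sum(ind)

def tournament(pop):
    return max(random.sample(pop, 3), key=fitness)

pop = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP)]
for g in range(GENERATIONS):
    nxt = []
    while len(nxt) < POP:
        p1, p2 = tournament(pop), tournament(pop)
        cut = random.randrange(1, N_BITS)                            # one-point crossover
        child = p1[:cut] + p2[cut:]
        child = [b ^ (random.random() < MUT_RATE) for b in child]    # bit-flip mutation
        nxt.append(child)
    pop = nxt
print("best fitness:", fitness(max(pop, key=fitness)))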

About the speaker:

David E. Goldberg (BSE, 1975, MSE, 1976, PhD, 1983, Civil Engineering, University of Michigan, Ann Arbor) is the Jerry S. Dobrovolny Distinguished Professor of Entrepreneurial Engineering at the University of Illinois at Urbana-Champaign (UIUC) and director of the Illinois Genetic Algorithms Laboratory (IlliGAL, http://www-illigal.ge.uiuc.edu/). Between 1976 and 1980 he held a number of positions at Stoner Associates of Carlisle, PA, including Project Engineer and Marketing Manager. Following his doctoral studies he joined the Engineering Mechanics faculty at the University of Alabama, Tuscaloosa, in 1984, and he moved to the University of Illinois in 1990. Professor Goldberg was a 1985 recipient of a U.S. National Science Foundation Presidential Young Investigator Award, and in 1995 he was named an Associate of the Center for Advanced Study at UIUC. He was founding chairman of the International Society for Genetic and Evolutionary Computation (http://www.isgec.org/), and his book Genetic Algorithms in Search, Optimization and Machine Learning (Addison-Wesley, 1989) is one of the most widely cited texts in computer science. His research focuses on the design, analysis, and application of genetic algorithms (computer procedures based on the mechanics of natural genetics and selection) and other innovating machines. His recent book, The Design of Innovation: Lessons from and for Competent Genetic Algorithms (http://www-doi.ge.uiuc.edu/), discusses (1) how to design scalable genetic algorithms and (2) how such procedures are similar to certain processes of human innovation.

Wednesday, February 22nd, 4:00pm

Coffee at 3:45pm

Room ACES 2.302 (Avaya Auditorium)

Effective Short-Term Opponent Exploitation in Simplified Poker

Dr. Rob Holte   [homepage]

Department of Computer Sciences
University of Alberta

[Sign-up schedule for individual meetings]

Poker is a game filled with interesting AI challenges. Uncertainty in poker stems from two key sources, the shuffled deck and an adversary whose strategy is unknown. One approach is to find pessimistic game theoretic solutions (i.e. minimax), but human players have idiosyncratic weaknesses that can be effectively exploited if a model of their strategy can be learned by observing their play. However, games against humans last for at most a few hundred hands, so learning must be fast to be effective. We explore the effectiveness of two approaches to opponent modelling in the context of Kuhn poker, a small game for which game theoretic solutions are known. Parameter estimation and expert algorithms are both studied. Experiments demonstrate that, even in this small game, convergence to maximally exploitive solutions in a small number of hands is impractical, but that good (i.e. better than Nash or breakeven) performance can be achieved in a short period of time. Finally, we also show that even amongst a set of strategies with equal game theoretic value, in particular the set of Nash equilibrium strategies, some are preferable because they speed learning of the opponent's strategy by exploring it more effectively.
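
The parameter-estimation idea can be pictured with a frequency-count sketch like the one below; the situations, the Laplace-smoothed prior, and the sample hand history are invented and do not reproduce the authors' estimator.

from collections import defaultdict

# Frequency-count sketch of opponent modelling: estimate how often the opponent
# bets in each observable situation from a short history of hands, with
# Laplace smoothing so early estimates stay conservative. The situations and
# the sample history below are made up; this is not the authors' estimator.

counts = defaultdict(lambda: [1, 2])    # situation -> [times bet, times observed]

def observe(situation, opponent_bet):
    c = counts[situation]
    c[1] += 1
    if opponent_bet:
        c[0] += 1

def bet_probability(situation):
    bet, seen = counts[situation]
    return bet / seen

history = [("first_to_act", True), ("first_to_act", False),
           ("facing_check", True), ("first_to_act", True)]
for situation, did_bet in history:
    observe(situation, did_bet)

for s in counts:
    print(s, round(bet_probability(s), 2))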

About the speaker:

Professor Robert Holte is a well-known member of the international machine learning research community, former editor-in-chief of the leading international journal in this field (Machine Learning), and current director of the Alberta Ingenuity Centre for Machine Learning. His main scientific contributions are his seminal works on the problem of small disjuncts and the performance of very simple classification rules. His current machine learning research investigates cost-sensitive learning and learning in game-playing (for example: opponent modelling in poker, and the use of learning for gameplay analysis of commercial computer games). In addition to machine learning he undertakes research in single-agent search (pathfinding): in particular, the use of automatic abstraction techniques to speed up search. He has over 55 scientific papers to his credit, covering both pure and applied research, and has served on the steering committee or program committee of numerous major international AI conferences.

Friday, February 17th, 3:00pm

Coffee at 2:45pm

ACES 2.402

Intrinsic Motivation and Computational Reinforcement Learning

Dr. Andrew Barto   [homepage]

Department of Computer Science
University of Massachusetts at Amherst

[Sign-up schedule for individual meetings]

Motivation is a key factor in human learning. We learn best when we are highly motivated to learn. Psychologists distinguish between extrinsically-motivated behavior, which is behavior undertaken to achieve some externally supplied reward, such as a prize, a high grade, or a high-paying job, and intrinsically-motivated behavior, which is behavior done for its own sake. Is there an analogous distinction for machine learning systems? Can we say of a machine learning system that it is motivated to learn, and if so, can it be meaningful to distinguish between extrinsic and intrinsic motivation? Further, is intrinsic motivation something that we—as machine learning researchers—should care about? In this talk, I argue that the answer to each of these questions is “yes.” After presenting a brief overview of the history of ideas related to intrinsic motivation in machine learning, I describe some of our recent computational experiments that explore these ideas within the framework of computational reinforcement learning (RL). It is a common perception that computational RL only deals with extrinsic reward because an RL agent is typically seen as receiving reward signals only from its external environment. To the contrary, however, I argue that the computational RL framework is particularly well suited for incorporating principles of intrinsic motivation, and I present our view that extending learning in this direction is important for creating competent adaptive agents.
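
One common way to make the extrinsic/intrinsic distinction concrete in code is to add a novelty bonus to the environment's reward inside an ordinary Q-learning loop, as in the toy sketch below; the chain environment, the 1/visit-count bonus, and all parameters are placeholders rather than the speaker's models.

import random
from collections import defaultdict

# Toy Q-learning agent whose learning signal is extrinsic reward plus a simple
# novelty bonus (an intrinsic reward that decays as a state becomes familiar).
# The 10-state chain environment and the bonus form are placeholders.

random.seed(0)
N, ALPHA, GAMMA, EPS = 10, 0.5, 0.95, 0.1
Q = defaultdict(float)
visits = defaultdict(int)

def step(s, a):                       # chain: move left/right, goal at the end
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N - 1 else 0.0)

for episode in range(200):
    s = 0
    for t in range(50):
        a = random.randint(0, 1) if random.random() < EPS else max((0, 1), key=lambda x: Q[(s, x)])
        s2, extrinsic = step(s, a)
        visits[s2] += 1
        intrinsic = 1.0 / visits[s2]          # novelty bonus: large for rarely visited states
        target = extrinsic + intrinsic + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)])
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2
        if extrinsic > 0:
            break

print("greedy action in start state:", max((0, 1), key=lambda a: Q[(0, a)]))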

About the speaker:

Andrew Barto is Professor of Computer Science, University of Massachusetts, Amherst. He received his B.S. with distinction in mathematics from the University of Michigan in 1970, and his Ph.D. in Computer Science in 1975, also from the University of Michigan. He joined the Computer Science Department of the University of Massachusetts Amherst in 1977 as a Postdoctoral Research Associate, became an Associate Professor in 1982, and has been a Full Professor since 1991. He is Co-Director of the Autonomous Learning Laboratory and a core faculty member of the Neuroscience and Behavior Program of the University of Massachusetts. His research centers on learning in natural and artificial systems, and he has studied machine learning algorithms since 1977, contributing to the development of the computational theory and practice of reinforcement learning. His current research centers on models of motor learning and reinforcement learning methods for real-time planning and control, with specific interest in autonomous mental development through intrinsically motivated reinforcement learning.

Friday, February 10th, 11:00am

Coffee at 10:45am

ACES 6.304

Link Mining

Dr. Lise Getoor   [homepage]

Computer Science Department/UMIACS
University of Maryland at College Park

[Sign-up schedule for individual meetings]

A key challenge for data mining is tackling the problem of mining richly structured datasets, where the objects are linked in some way. Links among the objects may demonstrate certain patterns, which can be helpful for many data mining tasks and are usually hard to capture with traditional statistical models. Recently there has been a surge of interest in this area, fueled largely by interest in web and hypertext mining, but also by interest in mining social networks, security and law enforcement data, bibliographic citations and epidemiological records.

Link mining includes both descriptive and predictive modeling of link data. Classification and clustering in linked relational domains require new data mining models and algorithms. Furthermore, with the introduction of links, new predictive tasks come to light. Examples include predicting the numbers of links, predicting the type of link between two objects, inferring the existence of a link, inferring the identity of an object, finding co-references, and discovering subgraph patterns.

In this talk, I will give an overview of this newly emerging research area. I will describe novel aspects of the modeling, learning and inference tasks. Then, as time permits, I will describe some of my group's recent work on link-based classification and entity resolution in linked data.
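
A small illustration of link prediction, one of the tasks listed above, is the classic common-neighbors baseline sketched below; the toy graph is invented, and the method is a generic baseline rather than the speaker's models.

from itertools import combinations

# Tiny link-prediction sketch: score unlinked node pairs by the number of
# neighbors they share. The co-authorship-style graph below is invented.

edges = {("ann", "bob"), ("ann", "carl"), ("bob", "carl"),
         ("carl", "dana"), ("dana", "eve")}
nodes = {n for e in edges for n in e}
nbrs = {n: {b for a, b in edges if a == n} | {a for a, b in edges if b == n} for n in nodes}

scores = {(u, v): len(nbrs[u] & nbrs[v])
          for u, v in combinations(sorted(nodes), 2)
          if (u, v) not in edges and (v, u) not in edges}

for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)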

About the speaker:

Prof. Lise Getoor is an assistant professor in the Computer Science Department at the University of Maryland, College Park. She received her PhD from Stanford University in 2001. Her current work includes research on link mining, statistical relational learning and representing uncertainty in structured and semi-structured data. Her work in these areas has been supported by NSF, NGA, KDD, ARL and DARPA. In July 2004, she co-organized the third in a series of successful workshops on statistical relational learning, http://www.cs.umd/srl2004. She has published numerous articles in machine learning, data mining, database and AI forums. She is a member of AAAI Executive council, is on the editorial board of the Machine Learning Journal and JAIR and has served on numerous program committees including AAAI, ICML, IJCAI, KDD, SIGMOD, UAI, VLDB, and WWW.

Friday, January 27th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Rethinking State, Action, and Reward in Reinforcement Learning

Dr. Satinder Singh   [homepage]

Computer Science & Engineering
University of Michigan, Ann Arbor

[Sign-up schedule for individual meetings]

Over the last decade and more, there has been rapid theoretical and empirical progress in reinforcement learning (RL) using the well-established formalisms of Markov decision processes (MDPs) and partially observable MDPs or POMDPs. At the core of these formalisms are particular formulations of the elemental notions of state, action, and reward that have served the field of RL so well. In this talk, I will describe recent progress in rethinking these basic elements to take the field beyond (PO)MDPs. In particular, I will briefly describe older work on flexible notions of actions called options, briefly describe some recent work on intrinsic rather than extrinsic rewards, and then spend the bulk of my time on recent work on predictive representations of state. I will conclude by arguing that taken together these advances point the way for RL to address the many challenges of building an artificial intelligence.

About the speaker:

Satinder Singh is an Associate Professor of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor. His main research interest is in the old-fashioned goal of Artificial Intelligence, that of building autonomous agents that can learn to be broadly competent in complex, dynamic, and uncertain environments. The field of reinforcement learning (RL) has focused on this goal, and accordingly his deepest contributions are in RL. More recently he has also been contributing to computational game theory and mechanism design.

Thursday, January 26th, 11:00am

Coffee at 10:45am

ACES 2.402

Detecting Online Credit Card Fraud: A Data Driven Approach

Dr. David Moriarty   [homepage]

Director of Data Mining
Apple Computers

[Sign-up schedule for individual meetings]

A consistent problem plaguing online merchants today is the growth and evolution of online credit card fraud. Thieves harvest credit card numbers from a myriad of sources, place online orders from all over the world, and ship to countless drop locations. Current estimates place the problem at 1 to 1.5 billion dollars annually, for which the online merchant holds complete liability.

In this talk, I will describe the problem of eCommerce fraud and outline various detection measures that online merchants employ. Like many merchants, Apple Computer leverages techniques from data mining, machine learning, and statistics to efficiently discover fraud patterns and adapt to new trends. Some topics that I will discuss include evaluating fraud patterns in historic data, discovering efficient pattern matching rules, building fraud predictive models, inferencing through order linkages, and anomaly detection.
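
A toy illustration of combining an anomaly check with a pattern-matching rule is sketched below; every field, threshold, and data value is invented, and the sketch implies nothing about Apple's actual detection systems.

import statistics

# Toy fraud screen combining one anomaly check (order amount far from the
# customer's history) with one pattern-matching rule (shipping country differs
# from billing country). All fields, thresholds, and data are invented.

def risk_score(order, past_amounts):
    score = 0.0
    if len(past_amounts) >= 2:
        mu = statistics.mean(past_amounts)
        sigma = statistics.stdev(past_amounts) or 1.0
        z = (order["amount"] - mu) / sigma
        if z > 3:                       # unusually large order for this customer
            score += 0.6
    if order["ship_country"] != order["bill_country"]:
        score += 0.3
    return score

history = [120.0, 95.0, 140.0, 110.0]
order = {"amount": 2400.0, "ship_country": "RO", "bill_country": "US"}
print("risk:", risk_score(order, history))     # flag for manual review if high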

About the speaker:

David Moriarty is the Director of Data Mining at Apple Computer, where he leads a group of scientists developing analytic solutions to large-scale business problems. Specifically, Dr. Moriarty leverages data patterns to optimize strategic decisions in various business areas, including fraud detection, product quality, logistics, and sales. Dr. Moriarty received an M.S. and Ph.D. in computer science from the University of Texas at Austin, specializing in artificial intelligence and machine learning. He regularly serves on journal and conference review committees and is a founding member of the Merchant Risk Council. Before Apple Computer, David designed intelligent algorithms at the Naval Research Laboratory, Daimler-Chrysler Research Center, USC Information Sciences Institute, and Intelligent Technologies Corporation.

Friday, January 20th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Support Vector Machines for Structured Outputs

Dr. Thorsten Joachims   [homepage]

Department of Computer Science
Cornell University

[Sign-up schedule for individual meetings]

Over the last decade, much of the research on discriminative learning has focused on problems like classification and regression, where the prediction is a single univariate variable. But what if we need to predict complex objects like trees, orderings, or alignments? Such problems arise, for example, when a natural language parser needs to predict the correct parse tree for a given sentence, when one needs to optimize a multivariate performance measure like the F1-score, or when predicting the alignment between two proteins.

This talk discusses a support vector approach to predicting complex objects. It generalizes the idea of margins to complex prediction problems and a large range of loss functions. While the resulting training problems have exponential size, there is a simple algorithm that allows training in polynomial time. The algorithm is implemented in the SVM-Struct software and empirical results will be given for several examples.
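
To make the structured-prediction setting concrete, the sketch below trains a structured perceptron for a tiny sequence-labeling task; it shares the prediction rule argmax_y w.Psi(x,y) with the max-margin approach described above, but the perceptron update stands in for the SVM-Struct optimization, and the tag set and sentences are invented.

from collections import defaultdict

# Structured perceptron for toy sequence labeling: joint features over an
# entire tag sequence, Viterbi search for the argmax, additive updates when
# the predicted sequence differs from the gold one.

TAGS = ["DET", "NOUN", "VERB"]
train = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
         (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
         (["dogs", "bark"], ["NOUN", "VERB"])]

w = defaultdict(float)

def features(words, tags):
    f = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(words, tags):
        f[("emit", tag, word)] += 1
        f[("trans", prev, tag)] += 1
        prev = tag
    return f

def predict(words):
    # Viterbi search for the highest-scoring tag sequence under w.
    best = {("<s>",): 0.0}
    for word in words:
        nxt = {}
        for seq, score in best.items():
            for tag in TAGS:
                s = score + w[("emit", tag, word)] + w[("trans", seq[-1], tag)]
                nxt[seq + (tag,)] = s
        pruned = {}                       # keep only the best history per last tag
        for seq, score in nxt.items():
            k = seq[-1]
            if k not in pruned or score > pruned[k][1]:
                pruned[k] = (seq, score)
        best = {seq: score for seq, score in pruned.values()}
    return max(best.items(), key=lambda kv: kv[1])[0][1:]

for epoch in range(10):
    for words, gold in train:
        guess = list(predict(words))
        if guess != gold:
            for k, v in features(words, gold).items():
                w[k] += v
            for k, v in features(words, guess).items():
                w[k] -= v

print(predict(["the", "cat", "barks"]))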

About the speaker:

Thorsten Joachims is an Assistant Professor in the Department of Computer Science at Cornell University. In 2001, he finished his dissertation with the title "The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms", advised by Prof. Katharina Morik at the University of Dortmund. From there he also received his Diplom in Computer Science in 1997 with a thesis on WebWatcher, a browsing assistant for the Web. His research interests center on a synthesis of theory and system building in the field of machine learning, with a focus on Support Vector Machines and machine learning with text. He authored the SVM-Light algorithm and software for support vector learning. From 1994 to 1996 he was a visiting scientist at Carnegie Mellon University with Prof. Tom Mitchell.

Friday, December 16th, 11:00am

Coffee at 10:45am

TAY 3.128

Learning Hierarchical Task Networks from Problem Solving

Dr. Pat Langley   [homepage]

Institute for the Study of Learning and Expertise and Stanford University

[Sign-up schedule for individual meetings]

In this talk, I present a novel approach to representing, utilizing, and learning hierarchical structures. The new formalism - teleoreactive logic programs - involves a special form of hierarchical task network that indexes methods by the goals they achieve. These structures can be used for reactive but goal-directed execution, and they can be interleaved with problem solving over primitive operators to address tasks for which there are no stored methods. Successful problem solving leads to the incremental creation of new methods that handle analogous tasks directly in the future. The learning module determines the structure of the hierarchy, the heads or indices of component methods, and the conditions on these methods. I report experiments on three domains that demonstrate rapid learning of both disjunctive and recursive structures that transfer well to more complex tasks. In closing, I discuss related research on learning from problem solving and propose directions for future research.

This talk describes work done jointly with Dongkyu Choi and Seth Rogers.
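
The idea of indexing methods by the goals they achieve can be pictured with the toy sketch below; the goals, methods, operators, and state representation are invented, and this is only a loose illustration, not the teleoreactive logic program formalism or its learning module.

# Toy sketch of hierarchical methods indexed by the goals they achieve.
# Each method lists the conditions under which it applies and the subgoals it
# decomposes into; primitive goals map directly to operators that change state.
# All goal names, methods, and the state below are invented.

methods = {
    "have_tea": [{"conditions": {"have_water", "have_cup"},
                  "subgoals": ["water_boiled", "tea_poured"]}],
}
operators = {                              # primitive goals -> state changes
    "water_boiled": lambda s: s | {"water_boiled"},
    "tea_poured":   lambda s: s | {"tea_poured"},
}

def achieve(goal, state, depth=0):
    indent = "  " * depth
    if goal in state:
        return state
    for m in methods.get(goal, []):        # goal-indexed method lookup
        if m["conditions"] <= state:
            print(indent + f"method for {goal}")
            for sub in m["subgoals"]:
                state = achieve(sub, state, depth + 1)
            return state | {goal}
    if goal in operators:                  # fall back to a primitive operator
        print(indent + f"operator {goal}")
        return operators[goal](state)
    raise ValueError(f"no method or operator achieves {goal}")

print(achieve("have_tea", {"have_water", "have_cup"}))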

About the speaker:

Dr. Pat Langley serves as Director of the Institute for the Study of Learning and Expertise, Consulting Professor of Symbolic Systems at Stanford University, and Head of the Computational Learning Laboratory at Stanford's Center for the Study of Language and Information. He has contributed to the fields of artificial intelligence and cognitive science for over 25 years, having published 200 papers and five books on these topics, including the text Elements of Machine Learning. Professor Langley is considered a co-founder of the field of machine learning, where he championed both experimental studies of learning algorithms and their application to real-world problems before either was popular and before the phrase 'data mining' became widespread. Dr. Langley is a AAAI Fellow, he was founding Executive Editor of the journal Machine Learning, and he was Program Chair for the Seventeenth International Conference on Machine Learning. His research has dealt with learning in planning, reasoning, language, vision, robotics, and scientific knowledge discovery, and he has contributed novel methods to a variety of paradigms, including logical, probabilistic, and case-based learning. His current research focuses on methods for constructing explanatory process models in scientific domains and on integrated architectures for intelligent physical agents.

Friday, December 9th, 11:00am

Coffee at 10:45am

ACES 2.402

A Connectionist Model of Sentence Comprehension in Visual Worlds

Dr. Marty Mayberry  [homepage]

Department of Computational Linguistics and Phonetics
Universität des Saarlandes

Recent "visual worlds" studies, wherein researchers study language in situated settings by monitoring eye-movements in a visual scene during sentence processing, have revealed much about the interaction of scene information with other information sources such as linguistic, semantic, and world knowledge, as well as the time course of their influence on comprehension. These studies underscore how adaptively the human sentence processor uses all available information to interpret and disambiguate a sentence more rapidly, and even to anticipate upcoming arguments. Furthermore, some of these experiments have begun to provide insight into how the acquisition of language affects the influence of different information sources.

In this talk, I will describe the modelling of five experiments that trade off scene context with a variety of linguistic factors using a Simple Recurrent Network that has been modified to integrate a scene representation with the standard incremental input of a sentence. The results show that the model captures the qualitative behavior observed during the experiments, while retaining the ability to develop the correct interpretation in the absence of visual input. Moreover, the network correctly models the empirical observation of the relative importance of visual context over learned stereotypical associations and supports a developmental account of how this preference is acquired.
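
Architecturally, the model described above can be pictured as an Elman-style simple recurrent network whose input at each word is concatenated with a scene vector; the skeleton below shows only an untrained forward pass, with invented dimensions and random vectors standing in for real word and scene representations.

import numpy as np

# Skeleton of an Elman-style simple recurrent network that reads a sentence one
# word at a time while a fixed scene vector is concatenated onto every input.
# Dimensions, random weights, and random word/scene vectors are placeholders;
# no training is shown.

rng = np.random.default_rng(1)
WORD_DIM, SCENE_DIM, HIDDEN, OUT = 20, 12, 30, 8

W_in = rng.normal(scale=0.1, size=(HIDDEN, WORD_DIM + SCENE_DIM + HIDDEN))
W_out = rng.normal(scale=0.1, size=(OUT, HIDDEN))

def run(sentence_vectors, scene_vector):
    hidden = np.zeros(HIDDEN)              # context layer starts empty
    for word_vec in sentence_vectors:
        x = np.concatenate([word_vec, scene_vector, hidden])
        hidden = np.tanh(W_in @ x)         # new hidden state feeds back next step
    return W_out @ hidden                  # interpretation read out after last word

sentence = [rng.normal(size=WORD_DIM) for _ in range(5)]
scene = rng.normal(size=SCENE_DIM)
print(run(sentence, scene).round(2))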

About the speaker:

Marshall R. Mayberry, III is currently a postdoctoral researcher at the Department of Computational Linguistics and Phonetics in Saarbruecken, Germany. He received his Ph.D. in Computer Science in 2003 from the University of Texas at Austin, where he also received an MS in Computer Science in 1998, as well as a BS in both Computer Science and Mathematics in 1993.

His research interests have revolved around the modelling of human language sentence processing with neural networks. The focus of these models has been on performance issues such as incrementality, anticipation, adaptation, and robustness. His dissertation work demonstrated that a neural network model could be scaled up to parse sentences from the medium-sized LinGO Redwoods Treebank into semantic graph structures. His current research has concentrated on the modelling of how a variety of linguistic and non-linguistic factors interact during sentence processing.

Thursday, December 1st, 11:00am

Coffee at 10:45am

TAY 3.128

Deconstructing the First-Person Shooter to Understand Human-Consistency and Transfer Learning to Create Better Artificially Intelligent Players

Dr. G. Michael Youngblood  [homepage]

CSE Department
University of Texas at Arlington

Computer games provide fertile ground for the study of how humans play games, interact with objects in virtual environments, and transfer knowledge across scenarios. The genre of first-person shooter (FPS) games comprises 16.3 percent of the electronic entertainment market and 19 percent of online gaming. They provide immersive, engaging, and highly interactive worlds that allow players to engage in behaviors similar to those in the real world. Our work has involved observing human players complete assigned tasks in these FPS games, deconstructing the immense environments of this genre into sets of interactive feature points to track human interaction, and evaluating this captured data to assist in understanding the notions of human performance and human-consistency. We have produced our own agents based in part on an architecture for intelligence we designed named the D'Artagnan Cognitive Architecture (DCA). Understanding human and agent performance through a set of performance and clustering metrics, we have shown that a basic implementation of DCA was able to complete 29% of a reference level set, of which 73% were within human performance levels and 15.4% were within our definition of human-consistency. We have further explored the use of neural network control of artificial characters in FPS games and spatial reasoning in these complex environments. Our current work involves developing a set of scenarios for understanding the transfer of knowledge between source and target environments and developing artificial agents that can exhibit the same types of knowledge transfer in our Urban Combat Testbed.

About the speaker:

Dr. Youngblood is currently a Faculty Research Associate, a PI for the DARPA-funded and ISLE-led Transfer Learning effort, and Chief Scientist for the NSF ITR-funded MavHome Project at the University of Texas at Arlington. He is a former US Navy submariner and professional software engineer; he earned his Honors BS in Computer Science and Engineering in 1999, his MS in CSE in 2002, and his Ph.D. in CSE in 2005. His research interests are in Entertainment Computing, Intelligent Systems, Pervasive Computing, and Autonomous Systems.

Monday, November 21st, 1:00pm

Coffee at 12:45pm

Avaya Auditorium, ACES 2.302

Global Inference in Learning for Natural Language Processing

Dr. Dan Roth   [homepage]

Department of Computer Science and the Beckman Institute
University of Illinois at Urbana-Champaign

[Sign-up schedule for individual meetings]

Natural language decisions often involve assigning values to sets of variables where complex and expressive dependencies can influence, or even dictate, what assignments are possible. Dependencies may range from simple statistical correlations to those that are constrained by deeper structural, relational and semantic properties of the text.

I will describe research on a framework that combines learning and inference for this problem, of inferring structured and constrained output. The inference process of assigning globally optimal values to mutually dependent variables is formalized as an optimization problem and is solved as an integer linear programming (ILP) problem. Several key issues will be discussed, including the incorporation of both statistical and declarative constraints and training paradigms.

The work will be described in the context of the Semantic Role Labeling tasks, inferring a shallow semantic analysis of sentences at the level of "who did what to whom, how, when and why".
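
The flavor of constrained global inference can be shown with a brute-force sketch: pick a label for every candidate argument so that the summed classifier scores are maximal subject to a declarative constraint. Real systems solve this as an integer linear program, as described above; the labels, scores, and "at most one agent" constraint below are invented.

from itertools import product

# Brute-force sketch of global inference for an SRL-like task: exhaustive
# search over joint label assignments stands in for the ILP solver.

candidates = ["arg1", "arg2", "arg3"]
labels = ["AGENT", "PATIENT", "NONE"]
local_score = {                                # classifier scores, made up
    "arg1": {"AGENT": 2.0, "PATIENT": 0.5, "NONE": 0.1},
    "arg2": {"AGENT": 1.8, "PATIENT": 1.5, "NONE": 0.2},
    "arg3": {"AGENT": 0.3, "PATIENT": 0.4, "NONE": 1.0},
}

def satisfies_constraints(assignment):
    return list(assignment.values()).count("AGENT") <= 1   # at most one agent

best, best_score = None, float("-inf")
for combo in product(labels, repeat=len(candidates)):
    assignment = dict(zip(candidates, combo))
    if not satisfies_constraints(assignment):
        continue
    score = sum(local_score[c][l] for c, l in assignment.items())
    if score > best_score:
        best, best_score = assignment, score

print(best, best_score)   # the constraint forces arg2 away from AGENT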

About the speaker:

Dan Roth is an Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign and the Beckman Institute of Advanced Science and Technology (UIUC). He is a fellow of the Institute of Advanced Studies at the University of Illinois and a Willett Faculty Scholar of the College of Engineering. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, in 1982 and his Ph.D. in Computer Science from Harvard University in 1995.

Professor Roth's research spans both theoretical work in machine learning and intelligent reasoning and work on applying learning and inference to intelligent human-computer interaction -- focusing on learning and inference for natural language understanding tasks and intelligent access to free-form textual information. Prof. Roth has published over 80 papers in machine learning, natural language processing, knowledge representation and reasoning and has developed a learning system that has been used widely in this field. His paper "Learning in Natural Language" received the best paper award at the International Joint Conference on Artificial Intelligence (IJCAI) 1999. Among other awards are the NSF CAREER Award (1999), the Xerox Award for Faculty Research (2001, 2005), and the University of Illinois Award for Research with Undergraduates (2002).

Prof. Roth has presented several invited talks at international conferences, including keynote addresses at the Conference on Natural Language Learning (CoNLL-2000), Empirical Methods in Natural Language Processing (EMNLP-2002) and the European Conference on Machine Learning (ECML-2002). He was an editor of the Journal of Computational Linguistics, and is currently an action editor for the Machine Learning Journal and on the editorial board of Computational Intelligence and the Journal of Artificial Intelligence Research; he has served on the committees of all major conferences in Machine Learning, Learning Theory, Computational Linguistics and Artificial Intelligence. He was the program co-chair of CoNLL'02 and the program co-chair of ACL'03, the main international meeting of the Association for Computational Linguistics and the natural language processing community. He is currently the elected president of SIGNLL, the Association for Computational Linguistics' Special Interest Group on Natural Language Learning.

Friday, November 11th, 11:00am

Coffee at 10:45am

ACES 2.302 Avaya Auditorium

Top-down and bottom-up influences in human language comprehension

Dr. Ted Gibson   [homepage]

Department of Brain and Cognitive Sciences
MIT

In this talk, I will summarize recent language processing data from my lab that explore the influence of a variety of factors on human language comprehension, including syntactic structure, working memory, lexical frequency, and discourse context. Experimental evidence for a number of hypotheses will be summarized including:

(1) Local connections between syntactic/semantic dependents are easier to process than more distant connections. This factor helps to explain why sentences like (b) are so much easier to understand than sentences like (a), despite having the same words and meaning:
a. The reporter who the senator who John met at the party attacked was criticized by the editor.
b. At the party, John met the senator who attacked the reporter who was criticized by the editor.

(2) The human sentence processor is sensitive to syntactic expectations that are relatively certain to occur, such as a verb following a sequence like "The claim that the baseball player would hold out for more money ...". The greater the number of open expectations, the greater the local processing load.

(3) The human sentence processor is sensitive to the frequencies of different senses of words. For example, the word "that" is 78% complementizer; 11% determiner; 11% pronoun in large corpora of written text. The high bias of complementizers affects people's processing of this word, even in environments where complementizers are virtually impossible, e.g., "I visited that hotel last week."

Data from multiple languages will be summarized, including English, Japanese and Chinese.

About the speaker:

Ted Gibson was educated at Queen's University and Cambridge University, and received his Ph.D. in Computational Linguistics from Carnegie Mellon University in 1991. He is currently a Professor of Cognitive Science in the Department of Brain and Cognitive Sciences, with a joint position in the Linguistics Department at MIT, where he has been a faculty member since 1993.

His research interests include all factors that make putting words, phrases and sentences together easy or difficult to process, primarily in comprehension, but also in production. Four major research avenues that he has been pursuing in recent years are: (1) word order and sentence complexity / working memory and sentence complexity; (2) syntactic representational issues (e.g., are sentences context-free?); (3) discourse coherence representation issues (e.g., are discourse structures context-free?); and (4) the relationship between intonational boundary information and syntactic structure. Two primary kinds of methods are used in order to investigate these issues: (1) behavioral methods like reading and listening paradigms in order to gather reaction time and response accuracy data; and (2) corpus analyses.

Friday, November 11th, 3:00pm

Coffee at 2:45pm

TAY 3.128

Semantics for Autonomy and Interdependence in Agents

Dr. Carl E. Hewitt, Associate Professor (Emeritus)

Department of Electrical Engineering and Computer Science
MIT

In this talk I use Participatory Semantics for autonomy and interdependence in agents. It is based on Actor semantics (which in turn is based on physics) where Actors are the universal primitives of concurrent digital computation. In response to a message that it receives, an Actor can make local decisions, create more Actors, send more messages, and determine how to respond to the next message received. A serializer is an Actor that is continually open to the arrival of messages. A distinguishing characteristic of the Actor model is that every message sent to a serializer must arrive although this can take an unbounded amount of time. (However, the Actor model can be augmented with metrics.) Participatory Semantics of commitments provides means for studying issues of autonomy and interdependence in agents. Interdependence can be manifested in conversation and negotiation with others; autonomy in reasoning.
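
A minimal mailbox-and-thread sketch of the Actor idea (local state, message sends, and deciding how to handle the next message) is given below; it is a toy in Python, not a faithful rendering of Actor or Participatory Semantics, and the counter/printer example is invented.

import queue
import threading
import time

# Each actor owns a mailbox and a thread, processes one message at a time,
# can update local state, and can send messages to other actors.

class Actor:
    def __init__(self, behavior):
        self.mailbox = queue.Queue()
        self.behavior = behavior
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, message):
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message == "stop":
                break
            self.behavior(self, message)

def counter_behavior(self, message):
    self.count = getattr(self, "count", 0) + 1      # local state, local decision
    kind, reply_to = message
    if kind == "ping" and reply_to is not None:
        reply_to.send(("count", self.count))        # send a message to another actor

def printer_behavior(self, message):
    print("printer got:", message)

printer = Actor(printer_behavior)
counter = Actor(counter_behavior)
for _ in range(3):
    counter.send(("ping", printer))

time.sleep(0.2)                  # let the daemon threads drain their mailboxes
counter.send("stop")
printer.send("stop")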

About the speaker:

Carl E. Hewitt is an Associate Professor (Emeritus) in the Electrical Engineering and Computer Science department at the Massachusetts Institute of Technology (MIT).

Carl is known for his design of Planner that was the first Artificial Intelligence programming language based on procedural plans that were invoked using pattern-directed invocation from assertions and goals.

Thursday, October 27th, 3:45pm

Coffee at 4:00pm

Taylor 2.106

A Quantitative Theory of Neural Computation

Dr. Leslie G. Valiant   [homepage]

Division of Engineering and Applied Sciences
Harvard University

A central open question of neuroscience is to identify the data structures and algorithms that are used in neural systems to support successive acts of basic tasks such as memorization and association. We describe a theory of neural computation based on three physical parameters: the number n of neurons, the number d of synaptic connections per neuron, and the inverse synaptic strength k expressed as the number of presynaptic action potentials needed to cause a postsynaptic action potential. Our fourth parameter r expresses the number of neurons that represent a real world item. We describe a computational mechanism for realizing hierarchical memorization and other cognitive tasks that implies a relationship among these four parameters. For the locust olfactory system, estimates for all four parameters are available, and we show that these numbers are in agreement with the theory's predictions. In the human medial temporal lobe, neurons that represent invariant concepts have been identified, and we offer a quantitative mechanistic explanation of these otherwise paradoxical findings. More generally, we identify two useful regimes for neural computation, one with r and k large where each neuron may represent many items, and another in which r is small, k is 1, and every neuron represents at most one item.
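
The role of the parameter k can be illustrated with a toy simulation in which a neuron fires only when at least k of its presynaptic neighbors fired on the previous step; the random network, sizes, and seed below are placeholders, not a model of the locust data.

import random

# Toy illustration of the parameter k (inverse synaptic strength): a neuron
# fires only if at least k of its presynaptic neighbors fired on the previous
# step. Depending on k, the initial activity either spreads or dies out.

random.seed(0)
N, D, K, STEPS = 200, 16, 4, 5          # neurons, synapses per neuron, threshold, steps

presynaptic = {i: random.sample(range(N), D) for i in range(N)}
active = set(random.sample(range(N), 60))           # an initial set of firing neurons

for t in range(STEPS):
    active = {i for i in range(N)
              if sum(1 for j in presynaptic[i] if j in active) >= K}
    print(f"step {t + 1}: {len(active)} neurons firing")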

About the speaker:

Leslie Valiant was educated at King's College, Cambridge; Imperial College, London; and at Warwick University where he received his Ph.D. in computer science in 1974. He is currently T. Jefferson Coolidge Professor of Computer Science and Applied Mathematics in the Division of Engineering and Applied Sciences at Harvard, where he has taught since 1982. Before coming to Harvard he had taught at Carnegie-Mellon University, Leeds University, and the University of Edinburgh.

His work has ranged over several areas of theoretical computer science, particularly complexity theory, computational learning, and parallel computation. He also works in computational neuroscience, where his interests include understanding memory and learning.

He received the Nevanlinna Prize at the International Congress of Mathematicians in 1986 and the Knuth Award in 1997. He is a Fellow of the Royal Society (London) and a member of the National Academy of Sciences (USA).

Thursday, September 29th, 3:30pm

Coffee at 3:15pm

Taylor 3.128

Probabilistic Policy Reuse

Dr. Fernando Fernandez   [homepage]

Computer Science Department
Carnegie Mellon University

[Sign-up schedule for individual meetings]

We define Policy Reuse as a Reinforcement Learning technique guided by past policies, offering the challenge of balancing among three choices: the exploitation of the ongoing learned policy, the exploration of new random actions, and the exploitation of past policies. Policy Reuse is based on two main cornerstones: on the one hand, an exploration strategy able to bias the exploration of the domain with a predefined past policy; on the other hand, a similarity metric that allows the estimation of the similarity of past policies with respect to a new one. Policy Reuse contributes three main capabilities to Reinforcement Learning in the life-long term. First, it provides Reinforcement Learning algorithms with a mechanism to bias an exploration process by reusing a set of past policies that we call the Policy Library; second, Policy Reuse provides an incremental method to build such a library of policies; and last, our method to build the Policy Library has a novel side-effect in terms of learning the structure of the domain, i.e., the basis or the "eigen-policies" of the domain. We demonstrate theoretically that, if some conditions are satisfied, reusing such a set of "eigen-policies" allows us to bound the minimal expected gain received while learning a new policy. We also provide empirical results demonstrating that Policy Reuse improves the learning performance over different strategies that learn from scratch.
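
The exploration bias at the heart of Policy Reuse can be sketched as follows: at each step, follow a past policy with a probability that decays within the episode, and otherwise act epsilon-greedily on the policy being learned. The corridor task, the hand-coded past policy, and all parameters in the Python sketch below are invented placeholders.

import random
from collections import defaultdict

# Sketch of the exploration bias in Policy Reuse: follow the past policy with
# probability psi (decaying within the episode), otherwise act epsilon-greedily
# on the policy currently being learned.

random.seed(0)
N, ACTIONS = 12, (-1, 1)
GOAL = N - 1
Q = defaultdict(float)

def past_policy(state):          # a previously learned policy: always move right
    return 1

def greedy(state):
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def episode(psi=1.0, decay=0.95, eps=0.1, alpha=0.5, gamma=0.95):
    state = 0
    for step in range(100):
        if random.random() < psi:
            action = past_policy(state)                  # reuse the old policy
        elif random.random() < eps:
            action = random.choice(ACTIONS)              # explore
        else:
            action = greedy(state)                       # exploit the new policy
        nxt = max(0, min(GOAL, state + action))
        reward = 1.0 if nxt == GOAL else 0.0
        Q[(state, action)] += alpha * (reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
                                       - Q[(state, action)])
        state, psi = nxt, psi * decay                    # reuse probability decays
        if nxt == GOAL:
            return step + 1
    return 100

print([episode() for _ in range(5)])   # steps to goal over a few episodes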

About the speaker:

Fernando Fernandez is a postdoctoral fellow at the Computer Science Department of CMU. He received his Ph.D. degree in Computer Science from University Carlos III of Madrid (UC3M) in 2003. He received his B.Sc. in 1999 from UC3M, also in Computer Science. In the fall of 2000, Fernando was a visiting student at the Center for Engineering Science Advanced Research at Oak Ridge National Laboratory (Tennessee). From 2001 he was an assistant and later associate professor at UC3M. He is the recipient of a pre-doctoral FPU fellowship award from the Spanish Ministry of Education (MEC), a Doctoral Prize from UC3M, and a MEC-Fulbright postdoctoral Fellowship.

Fernando is interested in intelligent systems that operate in continuous and stochastic domains. In his thesis, entitled "Reinforcement Learning in Continuous State Spaces", he studied different discretization methods of the state space in Reinforcement Learning problems, specifically Nearest Prototype approaches. Since he arrived at CMU, he has focused his research on the transfer of policies between different Reinforcement Learning tasks, and on how to bias the exploration of new learning processes with previously learned policies. Applications of his research include robot soccer, adaptive educational systems, and tourism support tools.

Past Schedules

Spring 2005

Fall 2004

Spring 2004

Fall 2003

Spring 2003

Fall 2002

Spring 2002

Fall 2001

Spring 2001

Fall 2000

Spring 2000