Machine Learning Research Group | University of Texas

Publications: Script Learning

An important part of understanding natural language is the incorporation of world knowledge. One possible way to encode world knowledge is in the form of scripts, which model stereotypical sequences of events. A script system learns scripts by generalizing such sequences of events from text. Recent work from our group in this area has explored new representations of events in scripts and the use of recurrent neural networks to improve script learning.

SAGEViz: SchemA GEneration and Visualization
[Details] [PDF]
Sugam Devare, Mahnaz Koupaee, Gautham Gunapati, Sayontan Ghosh, Sai Vallurupalli, Yash Kumar Lal, Francis Ferraro, Nathanael Chambers, Greg Durrett, Raymond Mooney, Katrin Erk, Niranjan Balasubramanian
In Empirical Methods in Natural Language Processing (EMNLP) Demo Track, December 2023.
Schema induction involves creating a graph representation depicting how events unfold in a scenario. We present SAGEViz, an intuitive and modular tool that utilizes human-AI collaboration to create and update complex schema graphs efficiently, where multiple annotators (humans and models) can work simultaneously on a schema graph from any domain. The tool consists of two components: (1) a curation component powered by plug-and-play event language models to create and expand event sequences while human annotators validate and enrich the sequences to build complex hierarchical schemas, and (2) an easy-to-use visualization component to visualize schemas at varying levels of hierarchy. Using supervised and few-shot approaches, our event language models can continually predict relevant events starting from a seed event. We conduct a user study and show that users need less effort in terms of interaction steps with SAGEViz to generate schemas of better quality. We also include a video demonstrating the system.
ML ID: 424
Advances in Statistical Script Learning
[Details] [PDF] [Slides (PPT)]
Karl Pichotta
PhD Thesis, Department of Computer Science, The University of Texas at Austin, August 2017.
When humans encode information into natural language, they do so with the clear assumption that the reader will be able to seamlessly make inferences based on world knowledge. For example, given the sentence "Mrs. Dalloway said she would buy the flowers herself," one can make a number of probable inferences based on event co-occurrences: she bought flowers, she went to a store, she took the flowers home, and so on.
Observing this, it is clear that many different useful natural language end-tasks could benefit from models of events as they typically co-occur (so-called script models). Robust question-answering systems must be able to infer highly probable implicit events from what is explicitly stated in a text, as must robust information-extraction systems that map from unstructured text to formal assertions about relations expressed in the text. Coreference resolution systems, semantic role labeling, and even syntactic parsing systems could, in principle, benefit from event co-occurrence models.
To this end, we present a number of contributions related to statistical event co-occurrence models. First, we investigate a method of incorporating multiple entities into events in a count-based co-occurrence model. We find that modeling multiple entities interacting across events allows for improved empirical performance on the task of modeling sequences of events in documents.
Second, we give a method of applying Recurrent Neural Network sequence models to the task of predicting held-out predicate-argument structures from documents. This model allows us to easily incorporate entity noun information, and can allow for more complex, higher-arity events than a count-based co-occurrence model. We find the neural model improves performance considerably over the count-based co-occurrence model.
Third, we investigate the performance of a sequence-to-sequence encoder-decoder neural model on the task of predicting held-out predicate-argument events from text. This model does not explicitly model any external syntactic information, and does not require a parser. We find the text-level model to be competitive in predictive performance with an event-level model directly mediated by an external syntactic analysis.
Finally, motivated by this result, we investigate incorporating features derived from these models into a baseline noun coreference resolution system. We find that, while our additional features do not appreciably improve top-level performance, we can nonetheless provide empirical improvement on a number of restricted classes of difficult coreference decisions.
ML ID: 348
Statistical Script Learning with Recurrent Neural Networks
[Details] [PDF] [Poster]
Karl Pichotta and Raymond J. Mooney
In Proceedings of the Workshop on Uphill Battles in Language Processing (UBLP) at EMNLP 2016, Austin, TX, November 2016.
We describe some of our recent efforts in learning statistical models of co-occurring events from large text corpora using Recurrent Neural Networks.
ML ID: 336
Using Sentence-Level LSTM Language Models for Script Inference
[Details] [PDF] [Slides (PPT)] [Slides (PDF)]
Karl Pichotta and Raymond J. Mooney
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-16), 279-289, Berlin, Germany, 2016.
There is a small but growing body of research on statistical scripts, models of event sequences that allow probabilistic inference of implicit events from documents. These systems operate on structured verb-argument events produced by an NLP pipeline. We compare these systems with recent Recurrent Neural Net models that directly operate on raw tokens to predict sentences, finding the latter to be roughly comparable to the former in terms of predicting missing events in documents.
ML ID: 330
Learning Statistical Scripts with LSTM Recurrent Neural Networks
[Details] [PDF] [Slides (PPT)] [Slides (PDF)]
Karl Pichotta and Raymond J. Mooney
In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), Phoenix, Arizona, February 2016.
Scripts encode knowledge of prototypical sequences of events. We describe a Recurrent Neural Network model for statistical script learning using Long Short-Term Memory, an architecture which has been demonstrated to work well on a range of Artificial Intelligence tasks. We evaluate our system on two tasks, inferring held-out events from text and inferring novel events from text, substantially outperforming prior approaches on both tasks.
ML ID: 325
Statistical Script Learning with Recurrent Neural Nets
[Details] [PDF] [Slides (PDF)]
Karl Pichotta
December 2015. PhD proposal, Department of Computer Science, The University of Texas at Austin.
Statistical Scripts are probabilistic models of sequences of events. For example, a script model might encode the information that the event "Smith met with the President" should strongly predict the event "Smith spoke to the President." We present a number of results improving the state of the art of learning statistical scripts for inferring implicit events. First, we demonstrate that incorporating multiple arguments into events, yielding a more complex event representation than is used in previous work, helps to improve a co-occurrence-based script system's predictive power. Second, we improve on these results with a Recurrent Neural Network script sequence model which uses a Long Short-Term Memory component. We evaluate in two ways: first, we evaluate systems' ability to infer held-out events from documents (the "Narrative Cloze" evaluation); second, we evaluate novel event inferences by collecting human judgments.
We propose a number of further extensions to this work. First, we propose a number of new probabilistic script models leveraging recent advances in Neural Network training. These include recurrent sequence models with different hidden unit structure and Convolutional Neural Network models. Second, we propose integrating more lexical and linguistic information into events. Third, we propose incorporating discourse relations between spans of text into event co-occurrence models, either as output by an off-the-shelf discourse parser or learned automatically. Finally, we propose investigating the interface between models of event co-occurrence and coreference resolution, in particular by integrating script information into general coreference systems.
ML ID: 326
Statistical Script Learning with Multi-Argument Events
[Details] [PDF] [Poster]
Karl Pichotta and Raymond J. Mooney
In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014), 220-229, Gothenburg, Sweden, April 2014.
Scripts represent knowledge of stereotypical event sequences that can aid text understanding. Initial statistical methods have been developed to learn probabilistic scripts from raw text corpora; however, they utilize a very impoverished representation of events, consisting of a verb and one dependent argument. We present a script learning approach that employs events with multiple arguments. Unlike previous work, we model the interactions between multiple entities in a script. Experiments on a large corpus using the task of inferring held-out events (the "narrative cloze evaluation") demonstrate that modeling multi-argument events improves predictive accuracy.
ML ID: 296
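The multi-argument idea and the narrative cloze evaluation described in the abstract above can be illustrated with a small sketch. This is not the paper's model: the event format `(verb, subject, object)`, the function names, the entity-to-variable rewriting, and scoring by raw pair counts are all assumptions chosen to keep the example short.

```python
from collections import Counter
from itertools import combinations

# Hypothetical event format: (verb, subject, object); a missing argument is None.

def canonical_pair(first, second):
    """Rewrite entity names in an ordered event pair as variables x0, x1, ...
    so that counts generalize across documents that mention different entities
    while still recording which entity fills which role in both events."""
    variables = {}

    def rewrite(event):
        verb, *args = event
        rewritten = [verb]
        for arg in args:
            if arg is None:
                rewritten.append(None)
            else:
                variables.setdefault(arg, f"x{len(variables)}")
                rewritten.append(variables[arg])
        return tuple(rewritten)

    return rewrite(first), rewrite(second)

def train(documents):
    """Count ordered co-occurrences of event pairs within each document."""
    pair_counts = Counter()
    for events in documents:
        for earlier, later in combinations(events, 2):
            pair_counts[canonical_pair(earlier, later)] += 1
    return pair_counts

def score(candidate, before, after, pair_counts):
    """Sum raw pair counts between the candidate and its context events
    (a simplifying assumption; a real system would use probabilities)."""
    total = sum(pair_counts[canonical_pair(e, candidate)] for e in before)
    total += sum(pair_counts[canonical_pair(candidate, e)] for e in after)
    return total

def cloze_rank(document, index, candidates, pair_counts):
    """Narrative cloze: hold out document[index], rank the candidates by score,
    and return the 1-based rank of the held-out event."""
    before, after = document[:index], document[index + 1:]
    ranked = sorted(candidates,
                    key=lambda c: score(c, before, after, pair_counts),
                    reverse=True)
    return ranked.index(document[index]) + 1

# Toy training data (invented for illustration only).
TRAIN_DOCS = [
    [("meet", "smith", "president"), ("speak", "smith", "president")],
    [("meet", "jones", "mayor"), ("speak", "jones", "mayor")],
    [("meet", "lee", "kim"), ("leave", "lee", None)],
]

TEST_DOC = [("meet", "garcia", "chen"), ("speak", "garcia", "chen")]
CANDIDATES = [
    ("speak", "garcia", "chen"),
    ("speak", "chen", "garcia"),   # same verb, roles swapped
    ("leave", "garcia", None),
]

if __name__ == "__main__":
    counts = train(TRAIN_DOCS)
    print(cloze_rank(TEST_DOC, 1, CANDIDATES, counts))
```

Because both arguments are kept, the sketch can tell the held-out event apart from the role-swapped candidate, a distinction that a verb-plus-one-argument representation would lose.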
Schema acquisition from a single example
[Details] [PDF]
W. Ahn, W. F. Brewer and Raymond J. Mooney
Journal of Experimental Psychology: Learning, Memory, and Cognition, 18:391-412, 1992.
This study compares similarity-based learning (SBL) and explanation-based learning (EBL) approaches to schema acquisition. In SBL approaches, concept formation is based on similarity across multiple examples. However, these approaches seem to be appropriate when the learner cannot apply existing knowledge and when the concepts to be learned are nonexplanatory. EBL approaches assume that a schema can be acquired from even a single example by constructing an explanation of the example using background knowledge, and generalizing the resulting explanation. However, unlike the current EBL theories, Experiment 1 showed significant EBL occurred only when the background information learned during the experiment was actively used by the subjects. Experiment 2 showed the generality of EBL mechanisms across a variety of materials and test procedures.
ML ID: 212
Learning Plan Schemata From Observation: Explanation-Based Learning for Plan Recognition
[Details] [PDF]
Raymond J. Mooney
Cognitive Science, 14(4):483-509, 1990.
This article discusses how explanation-based learning of plan schemata from observation can improve performance of plan recognition. The GENESIS program is presented as an implemented system for narrative text understanding that learns schemata and improves its performance. Learned schemata allow GENESIS to use schema-based understanding techniques when interpreting events and thereby avoid the expensive search associated with plan-based understanding. Learned schemata also function as new concepts that can be used to cluster examples and index events in memory. In addition, experiments are reviewed which demonstrate that human subjects, like GENESIS, can learn a schema by observing, explaining, and generalizing a single specific instance presented in a narrative.
ML ID: 1
Schema Acquisition from One Example: Psychological Evidence for Explanation-Based Learning
[Details] [PDF]
W. Ahn, Raymond J. Mooney, W.F. Brewer and G.F. DeJong
In Proceedings of the Ninth Annual Conference of the Cognitive Science Society, 50-57, Seattle, WA, July 1987.
Recent explanation-based learning (EBL) models in AI allow a computer program to learn a schema by analyzing a single example. For example, GENESIS is an EBL system which learns a plan schema from a single specific instance presented in a narrative. Previous learning models in both AI and psychology have required multiple examples. This paper presents experimental evidence that people can learn a plan schema from a single narrative and that the learned schema agrees with that predicted by EBL. This evidence suggests that GENESIS, originally constructed as a machine learning system, can be interpreted as a psychological model of learning a complex schema from a single example.
ML ID: 207
Generalizing Explanations of Narratives into Schemata
[Details] [PDF]
Raymond J. Mooney
In Proceedings of the Third International Machine Learning Workshop, 126-128, New Brunswick, New Jersey, 1985.
This paper describes a natural language system which improves its performance through learning. The system processes short English narratives and from a single narrative acquires a new schema for a stereotypical set of actions. During the understanding process, the system constructs explanations for characters' actions in terms of the goals they were meant to achieve. If a character achieves a common goal in a novel way, it generalizes the set of actions used to achieve this goal into a new schema. The generalization process is a knowledge-based analysis of the narrative's causal structure which removes unnecessary details while maintaining the validity of the explanation. The resulting generalized set of actions is then stored as a new schema and used by the system to process narratives which were previously beyond its capabilities.
ML ID: 276
Learning Schemata for Natural Language Processing
[Details] [PDF]
Raymond J. Mooney and Gerald F. DeJong
In Proceedings of the Ninth International Joint Conference on Artificial Intelligence (IJCAI-85), 681-687, Los Angeles, CA, August 1985.
This paper describes a natural language system which improves its own performance through learning. The system processes short English narratives and is able to acquire, from a single narrative, a new schema for a stereotypical set of actions. During the understanding process, the system attempts to construct explanations for characters' actions in terms of the goals their actions were meant to achieve. When the system observes that a character has achieved an interesting goal in a novel way, it generalizes the set of actions they used to achieve this goal into a new schema. The generalization process is a knowledge-based analysis of the causal structure of the narrative which removes unnecessary details while maintaining the validity of the causal explanation. The resulting generalized set of actions is then stored as a new schema and used by the system to correctly process narratives which were previously beyond its capabilities.
ML ID: 205
Generalizing Explanations of Narratives into Schemata
[Details] [PDF]
Raymond J. Mooney
Master's Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1985.
This thesis describes a natural language system called GENESIS which improves its own performance through learning. The system processes short English narratives and is able to acquire, from a single narrative, a new schema for a stereotypical set of actions. During the understanding process, the system attempts to construct explanations for characters' actions in terms of the goals their actions were meant to achieve. When the system observes that a character in a narrative has achieved an interesting goal in a novel way, it generalizes the set of actions they used to achieve this goal into a new schema. The generalization process is a knowledge-based analysis of the causal structure of the narrative which removes unnecessary details while maintaining the validity of the causal explanation. The resulting generalized combination of actions is then stored as a new schema in the system's knowledge base. This new schema can then be used by the system to correctly process narratives which were previously beyond its capabilities.