CHILL

CHILL (Constructive Heuristics Induction for Language Learning) is a general approach to the problem of inducing natural language parsers. Given a suitably annotated corpus, CHILL produces a parser for mapping subsequent sentences into representations. It treats parser induction as the problem of learning rules to control the actions of a shift-reduce parser expressed as a Prolog program. Control rules are induced by utilizing a novel Inductive Logic Programming (ILP) algorithm, namely CHILLIN, that has been developed to handle the issues arising in the natural-language control-rule domain.

The CHILL Prolog code is available via anonymous ftp. See the INDEX file there for details.

Pointers to papers on CHILL and CHILLIN can be found on our Natural-Language Learning and ILP research page. Below are some standard references (click on the open book image).

Learning to Parse Database Queries using Inductive Logic Programming
John M. Zelle and Raymond J. Mooney
Proceedings of the Thirteenth National Conference on Aritificial Intelligence, pp. 1050-1055, Portland, OR, August, 1996. (AAAI-96)

This paper presents recent work using the CHILL parser acquisition system to automate the construction of a natural-language interface for database queries. CHILL treats parser acquisition as the learning of search-control rules within a logic program representing a shift-reduce parser and uses techniques from Inductive Logic Programming to learn relational control knowledge. Starting with a general framework for constructing a suitable logical form, CHILL is able to train on a corpus comprising sentences paired with database queries and induce parsers that map subsequent sentences directly into executable queries. Experimental results with a complete database-query application for U.S. geography show that CHILL is able to learn parsers that outperform a pre-existing, hand-crafted counterpart. These results demonstrate the ability of a corpus-based system to produce more than purely syntactic representations. They also provide direct evidence of the utility of an empirical approach at the level of a complete natural language application.

An Inductive Logic Programming Method for Corpus-based Parser Construction
John M. Zelle and Raymond J. Mooney
Submitted to Computational Lingusitics

Empirical methods for building natural language systems has become an important area of research in recent years. Most current approaches are based on propositional learning algorithms and have been applied to the problem of acquiring broad-coverage parsers for relatively shallow (syntactic) representations. This paper outlines an alternative empirical approach based on techniques from a subfield of machine learning known as Inductive Logic Programming (ILP). ILP algorithms, which learn relational (first-order) rules, are used in a parser acquisition system called CHILL that learns rules to control the behavior of a traditional shift-reduce parser. Using this approach, CHILL is able to learn parsers for a variety of different types of analyses, from traditional syntax trees to more meaning-oriented case-role and database query forms. Experimental evidence shows that CHILL performs comparably to propositional learning systems on similar tasks, and is able to go beyond the broad-but-shallow paradigm and learn mappings directly from sentences into useful semantic representations. In a complete database-query application, parsers learned by CHILL outperform an existing hand-crafted system, demonstrating the promise of empricial techniques for automating the construction certain NLP systems.

Using Inductive Logic Programming to Automate the Construction of Natural Language Parsers
John M. Zelle
Ph.D. Thesis, Department of Computer Sciences, University of Texas at Austin, August, 1995. (Technical Report AI96-249)

Designing computer systems to understand natural language input is a difficult task. In recent years there has been considerable interest in corpus-based methods for constructing natural language parsers. These empirical approaches replace hand-crafted grammars with linguistic models acquired through automated training over language corpora. A common thread among such methods to date is the use of propositional or probablistic representations for the learned knowledge. This dissertation presents an alternative approach based on techniques from a subfield of machine learning known as inductive logic programming (ILP). ILP, which investigates the learning of relational (first-order) rules, provides an empirical method for acquiring knowledge within traditional, symbolic parsing frameworks.
This dissertation details the architecture, implementation and evaluation of CHILL a computer system for acquiring natural language parsers by training over corpora of parsed text. CHILL treats language acquisition as the learning of search-control rules within a logic program that implements a shift-reduce parser. Control rules are induced using a novel ILP algorithm which handles difficult issues arising in the induction of search-control heuristics. Both the control-rule framework and the induction algorithm are crucial to CHILL's success.
The main advantage of CHILL over propositional counterparts is its flexibility in handling varied representations. CHILL has produced parsers for various analyses including case-role mapping, detailed syntactic parse trees, and a logical form suitable for expressing first-order database queries. All of these tasks are accomplished within the same framework, using a single, general learning method that can acquire new syntactic and semantic categories for resolving ambiguities.
Experimental evidence from both aritificial and real-world corpora demonstrate that CHILL learns parsers as well or better than previous artificial neural network or probablistic approaches on comparable tasks. In the database query domain, which goes beyond the scope of previous empirical approaches, the learned parser outperforms an existing hand-crafted system. These results support the claim that ILP techniques as implemented in CHILL represent a viable alternative with significant potential advantages over neural-network, propositional, and probablistic approaches to empirical parser construction.

rupert@cs.utexas.edu