Semantic Lexicon Acquisition for Learning Natural Language Interfaces (1998)
A long-standing goal for the field of artificial intelligence is to enable computer understanding of human languages. A core requirement in reaching this goal is the ability to transform individual sentences into a form better suited for computer manipulation. This ability, called semantic parsing, requires several knowledge sources, such as a grammar, lexicon, and parsing mechanism.
Building natural language parsing systems by hand is a tedious, error-prone undertaking. We build on previous research in automating the construction of such systems using machine learning techniques. The result is a combined system that learns semantic lexicons and semantic parsers from one common set of training examples. The input required is a corpus of sentence/representation pairs, where the representations are in the desired output format. A new system, Wolfie, learns semantic lexicons to be used as background knowledge by a previously developed parser acquisition system, Chill. The combined system is tested on a real-world domain of answering database queries. We also compare this combination to a combination of Chill with a previously developed lexicon learner, demonstrating superior performance with our system. In addition, we show the ability of the system to learn to process natural languages other than English. Finally, we test the system on an alternate sentence representation, and on a set of large, artificial corpora with varying levels of ambiguity and synonymy.
One difficulty in using machine learning methods to build natural language interfaces is constructing the required annotated corpus. We therefore also address this issue by using active learning to reduce the number of training examples required by both Wolfie and Chill. Experimental results show that the number of examples needed to reach a given level of performance can be significantly reduced with this method.
PhD Thesis, Department of Computer Sciences, University of Texas at Austin. 101 pages. Also appears as Technical Report AI 99-278, Artificial Intelligence Lab, University of Texas at Austin.

Cynthia Thompson, Ph.D. Alumni, cindi [at] cs utah edu