Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: 1997

  1. Integrating Abduction and Induction in Machine Learning
    Raymond J. Mooney
    In Working Notes of the IJCAI-97 Workshop on Abduction and Induction in AI, 37--42, Nagoya, Japan, August 1997.
    This paper discusses the integration of traditional abductive and inductive reasoning methods in the development of machine learning systems. In particular, the paper discusses our recent work in two areas: 1) The use of traditional abductive methods to propose revisions during theory refinement, where an existing knowledge base is modified to make it consistent with a set of empirical data; and 2) The use of inductive learning methods to automatically acquire from examples a diagnostic knowledge base used for abductive reasoning.
    ML ID: 79
  2. Relational Learning Techniques for Natural Language Information Extraction
    Mary Elaine Califf
    1997. Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin.
    The recent growth of online information available in the form of natural language documents creates a greater need for computing systems with the ability to process those documents to simplify access to the information. One type of processing appropriate for many tasks is information extraction, a type of text skimming that retrieves specific types of information from text. Although information extraction systems have existed for two decades, these systems have generally been built by hand and contain domain specific information, making them difficult to port to other domains. A few researchers have begun to apply machine learning to information extraction tasks, but most of this work has involved applying learning to pieces of a much larger system. This paper presents a novel rule representation specific to natural language and a learning system, RAPIER, which learns information extraction rules. RAPIER takes pairs of documents and filled templates indicating the information to be extracted and learns patterns to extract fillers for the slots in the template. This proposal presents initial results on a small corpus of computer-related job postings with a preliminary version of RAPIER. Future research will involve several enhancements to RAPIER as well as more thorough testing on several domains and extension to additional natural language processing tasks. We intend to extend the rule representation and algorithm to allow for more types of constraints than are currently supported. We also plan to incorporate active learning, or sample selection, methods, specifically query by committee, into RAPIER. These methods have the potential to substantially reduce the amount of annotation required. We will explore the issue of distinguishing relevant and irrelevant messages, since currently RAPIER extracts from any messages given to it, assuming that all are relevant. We also intend to run much larger tests with RAPIER on multiple domains including the terrorism domain from the third and fourth Message Understanding Conferences, which will allow comparison against other systems. Finally, we plan to demonstrate the generality of RAPIER's representation and algorithm by applying it to other natural language processing tasks such as word sense disambiguation.
    ML ID: 78
  3. Learning to Improve both Efficiency and Quality of Planning
    Tara A. Estlin and Raymond J. Mooney
    In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), 1227-1232, Nagoya, Japan, 1997.
    Most research in learning for planning has concentrated on efficiency gains. Another important goal is improving the quality of final plans. Learning to improve plan quality has been examined by a few researchers; however, little research has been done on learning to improve both efficiency and quality. This paper explores this problem by using the SCOPE learning system to acquire control knowledge that improves on both of these metrics. Since SCOPE uses a very flexible training approach, we can easily focus its learning algorithm to prefer search paths that are better for particular evaluation metrics. Experimental results show that SCOPE can significantly improve both the quality of final plans and overall planning efficiency.
    ML ID: 77
  4. Applying ILP-based Techniques to Natural Language Information Extraction: An Experiment in Relational Learning
    Mary Elaine Califf and Raymond J. Mooney
    In Workshop Notes of the IJCAI-97 Workshop on Frontiers of Inductive Logic Programming, 7--11, Nagoya, Japan, August 1997.
    Information extraction systems process natural language documents and locate a specific set of relevant items. Given the recent success of empirical or corpus-based approaches in other areas of natural language processing, machine learning has the potential to significantly aid the development of these knowledge-intensive systems. This paper presents a system, RAPIER, that takes pairs of documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. The learning algorithm incorporates techniques from several inductive logic programming systems and learns unbounded patterns that include constraints on the words and part-of-speech tags surrounding the filler. Encouraging results are presented on learning to extract information from computer job postings from the newsgroup misc.jobs.offered.
    ML ID: 76
  5. Learning to Parse Natural Language Database Queries into Logical Form
    Cynthia A. Thompson, Raymond J. Mooney, and Lappoon R. Tang
    In Proceedings of the ML-97 Workshop on Automata Induction, Grammatical Inference, and Language Acquisition, Nashville, TN, July 1997.
    For most natural language processing tasks, a parser that maps sentences into a semantic representation is significantly more useful than a grammar or automaton that simply recognizes syntactically well-formed strings. This paper reviews our work on using inductive logic programming methods to learn deterministic shift-reduce parsers that translate natural language into a semantic representation. We focus on the task of mapping database queries directly into executable logical form. An overview of the system is presented followed by recent experimental results on corpora of Spanish geography queries and English job-search queries.
    ML ID: 75
  6. Relational Learning of Pattern-Match Rules for Information Extraction
    Mary Elaine Califf and Raymond J. Mooney
    In Proceedings of the ACL Workshop on Natural Language Learning, 9-15, Madrid, Spain, July 1997.
    Information extraction systems process natural language documents and locate a specific set of relevant items. Given the recent success of empirical or corpus-based approaches in other areas of natural language processing, machine learning has the potential to significantly aid the development of these knowledge-intensive systems. This paper presents a system, RAPIER, that takes pairs of documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. The learning algorithm incorporates techniques from several inductive logic programming systems and learns unbounded patterns that include constraints on the words and part-of-speech tags surrounding the filler. Encouraging results are presented on learning to extract information from computer job postings from the newsgroup misc.jobs.offered.
    ML ID: 74
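    The pattern-match rules described in the abstract above can be illustrated with a toy fixed-length matcher built around RAPIER's pre-filler / filler / post-filler structure. This is only a sketch under simplifying assumptions: the rule and token examples are invented, and RAPIER's actual patterns are unbounded and also constrain part-of-speech tags.

```python
def match_rule(tokens, rule):
    """Apply one toy extraction rule to a token list.

    `rule` = (pre, filler, post), each a list of constraints in the
    spirit of RAPIER's pre-filler / filler / post-filler structure.
    A constraint is a set of allowed words, or None for "any word".
    Returns the matched filler tokens, or None if the rule fails.
    """
    pre, filler, post = rule
    constraints = pre + filler + post
    n = len(constraints)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(c is None or w in c for w, c in zip(window, constraints)):
            # Return only the filler portion of the matched window.
            return window[len(pre):len(pre) + len(filler)]
    return None

# Hypothetical rule: extract the one token that follows "salary:".
salary_rule = ([{'salary:'}], [None], [])
```

    Applied to the tokens of "offering salary: 50k per year", `salary_rule` extracts `['50k']`; the learning algorithm's job is to induce such rules, with their word and tag constraints, from filled templates.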
  7. Learning Parse and Translation Decisions From Examples With Rich Context
    Ulf Hermjakob and Raymond J. Mooney
    In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL'97/EACL'97), 482-489, July 1997.
    This paper presents a knowledge and context-based system for parsing and translating natural language and evaluates it on sentences from the Wall Street Journal. Applying machine learning techniques, the system uses parse action examples acquired under supervision to generate a deterministic shift-reduce parser in the form of a decision structure. It relies heavily on context, as encoded in features which describe the morphological, syntactical, semantical and other aspects of a given parse state.
    ML ID: 73
  8. Learning Parse and Translation Decisions From Examples With Rich Context
    Ulf Hermjakob
    PhD Thesis, Department of Computer Sciences, The University of Texas at Austin, Austin, TX, May 1997. 175 pages. Technical Report UT-AI97-261.
    The parsing of unrestricted text, with its enormous lexical and structural ambiguity, still poses a great challenge in natural language processing. The difficulties with traditional approaches, which try to master the complexity of parse grammars with hand-crafted rules, have led to a trend towards more empirical techniques.

    We therefore propose a system for parsing and translating natural language that learns from examples and uses some background knowledge.
    As our parsing model we choose a deterministic shift-reduce type parser that integrates part-of-speech tagging and syntactic and semantic processing, which not only makes parsing very efficient, but also assures transparency during the supervised example acquisition.
    Applying machine learning techniques, the system uses parse action examples to generate a parser in the form of a decision structure, a generalization of decision trees.
    To learn good parsing and translation decisions, our system relies heavily on context, as encoded in currently 205 features describing the morphological, syntactical and semantical aspects of a given parse state. Compared with recent probabilistic systems that were trained on 40,000 sentences, our system relies on more background knowledge and a deeper analysis, but radically fewer examples, currently 256 sentences.

    We test our parser on lexically limited sentences from the Wall Street Journal and achieve accuracy rates of 89.8% for labeled precision, 98.4% for part of speech tagging and 56.3% of test sentences without any crossing brackets. Machine translations of 32 Wall Street Journal sentences to German have been evaluated by 10 bilingual volunteers and been graded as 2.4 on a 1.0 (best) to 6.0 (worst) scale for both grammatical correctness and meaning preservation. The translation quality was only minimally better (2.2) when starting each translation with the correct parse tree, which indicates that the parser is quite robust and that its errors have only a moderate impact on final translation quality. These parsing and translation results already compare well with other systems and, given the relatively small training set and amount of overall knowledge used so far, the results suggest that our system Contex can break previous accuracy ceilings when scaled up further.

    ML ID: 72
  9. An Inductive Logic Programming Method for Corpus-based Parser Construction
    John M. Zelle and Raymond J. Mooney
    January 1997. Unpublished Technical Note.
    Empirical methods for building natural language systems have become an important area of research in recent years. Most current approaches are based on propositional learning algorithms and have been applied to the problem of acquiring broad-coverage parsers for relatively shallow (syntactic) representations. This paper outlines an alternative empirical approach based on techniques from a subfield of machine learning known as Inductive Logic Programming (ILP). ILP algorithms, which learn relational (first-order) rules, are used in a parser acquisition system called CHILL that learns rules to control the behavior of a traditional shift-reduce parser. Using this approach, CHILL is able to learn parsers for a variety of different types of analyses, from traditional syntax trees to more meaning-oriented case-role and database query forms. Experimental evidence shows that CHILL performs comparably to propositional learning systems on similar tasks, and is able to go beyond the broad-but-shallow paradigm and learn mappings directly from sentences into useful semantic representations. In a complete database-query application, parsers learned by CHILL outperform an existing hand-crafted system, demonstrating the promise of empirical techniques for automating the construction of certain NLP systems.
    ML ID: 71
  10. Parameter Revision Techniques for Bayesian Networks with Hidden Variables: An Experimental Comparison
    Sowmya Ramachandran and Raymond J. Mooney
    January 1997. Unpublished Technical Note.
    Learning Bayesian networks inductively in the presence of hidden variables is still an open problem. Even the simpler task of learning just the conditional probabilities on a Bayesian network with hidden variables is not completely solved. In this paper, we present an approach that learns the parameters of a Bayesian network composed of noisy-or and noisy-and nodes by using a gradient descent back-propagation approach similar to that used to train neural networks. For the task of causal inference, it has the advantage of being able to learn in the presence of hidden variables. We compare the performance of this approach with the adaptive probabilistic networks technique on a real-world classification problem in molecular biology, and show that our approach trains faster and learns networks with higher classification accuracy.
    ML ID: 70
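    The noisy-or parameterization and gradient-based update mentioned in the abstract above can be sketched for a single node as follows. This is a minimal illustration under stated assumptions: the function names and learning rate are invented, the update is gradient ascent on the log-likelihood of one observation (equivalently, descent on its negative), and the actual system trains full networks with hidden variables rather than one isolated node.

```python
def noisy_or(q, x):
    """P(effect = 1 | binary causes x) for a noisy-or node.

    q[i] is the probability that cause i alone triggers the effect;
    the effect fails only if every active cause independently fails.
    """
    prod = 1.0
    for qi, xi in zip(q, x):
        if xi:
            prod *= (1.0 - qi)
    return 1.0 - prod

def gradient_step(q, x, y, lr=0.1):
    """One gradient step on the log-likelihood of observation (x, y),
    analogous to back-propagation on a single training example."""
    p = noisy_or(q, x)
    new_q = []
    for qi, xi in zip(q, x):
        if xi and 0.0 < p < 1.0:
            # d log P(y|x) / d q_i for a noisy-or node:
            # dp/dq_i = (1 - p) / (1 - q_i) when cause i is active.
            if y:
                grad = (1.0 - p) / (p * (1.0 - qi))
            else:
                grad = -1.0 / (1.0 - qi)
        else:
            grad = 0.0
        # Clamp to keep each parameter a valid probability.
        new_q.append(min(max(qi + lr * grad, 1e-6), 1.0 - 1e-6))
    return new_q
```

    With `q = [0.8, 0.5]` and both causes active, `noisy_or` gives 1 - 0.2 * 0.5 = 0.9, and a step on a positive example (`y = 1`) pushes both parameters upward.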
  11. Semantic Lexicon Acquisition for Learning Parsers
    Cynthia A. Thompson and Raymond J. Mooney
    1997. Submitted for review.
    This paper describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that learns a semantic lexicon from a corpus of sentences paired with representations of their meaning. The lexicon learned consists of words paired with representations of their meaning, and allows for both synonymy and polysemy. WOLFIE is part of an integrated system that learns to parse novel sentences into their meaning representations. Experimental results are presented that demonstrate WOLFIE's ability to learn useful lexicons for a realistic domain. The lexicons learned by WOLFIE are also compared to those learned by another lexical acquisition system, that of Siskind (1996).
    ML ID: 69