Applying ILP-based Techniques to Natural Language Information Extraction: An Experiment in Relational Learning (1997)
Information extraction systems process natural language documents and locate a specific set of relevant items. Given the recent success of empirical or corpus-based approaches in other areas of natural language processing, machine learning has the potential to significantly aid the development of these knowledge-intensive systems. This paper presents a system, RAPIER, that takes pairs of documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. The learning algorithm incorporates techniques from several inductive logic programming systems and learns unbounded patterns that include constraints on the words and part-of-speech tags surrounding the filler. Encouraging results are presented on learning to extract information from computer job postings from the newsgroup misc.jobs.offered.
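As an illustration only, the kind of pattern-match rule the abstract describes can be sketched in Python. The sketch assumes a rule made of three token patterns (before, on, and after the filler), each constraining words and/or part-of-speech tags. The names `Element`, `Rule`, and `extract` are hypothetical, and each element matches exactly one token for simplicity. This is not RAPIER's actual rule representation, and it does not implement the learning algorithm.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


@dataclass(frozen=True)
class Element:
    """Constraint on one token; None means the field is unconstrained."""
    words: Optional[FrozenSet[str]] = None
    tags: Optional[FrozenSet[str]] = None

    def matches(self, token: Token) -> bool:
        word, tag = token
        if self.words is not None and word.lower() not in self.words:
            return False
        if self.tags is not None and tag not in self.tags:
            return False
        return True


@dataclass
class Rule:
    """Hypothetical extraction rule for one template slot."""
    slot: str
    pre: List[Element]     # pattern that must precede the filler
    filler: List[Element]  # pattern the filler itself must match
    post: List[Element]    # pattern that must follow the filler

    def extract(self, tokens: List[Token]) -> List[str]:
        elements = self.pre + self.filler + self.post
        start = len(self.pre)
        end = start + len(self.filler)
        found = []
        # Slide a fixed-size window over the document and test every element.
        for i in range(len(tokens) - len(elements) + 1):
            window = tokens[i:i + len(elements)]
            if all(e.matches(t) for e, t in zip(elements, window)):
                found.append(" ".join(w for w, _ in window[start:end]))
        return found


# Made-up rule and sentence, for illustration only.
rule = Rule(
    slot="language",
    pre=[Element(words=frozenset({"in", "with"}))],
    filler=[Element(tags=frozenset({"NNP"}))],
    post=[Element(words=frozenset({",", "and", "."}))],
)
tokens = [("Experience", "NN"), ("in", "IN"), ("Java", "NNP"),
          ("and", "CC"), ("SQL", "NNP"), (".", ".")]
fillers = rule.extract(tokens)  # ["Java"]
```

Only "Java" is extracted here: "SQL" is preceded by "and", which the hypothetical pre-filler constraint does not allow. A learner in the spirit of the abstract would have to induce such constraints from document and filled-template pairs.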
Citation:
In Workshop Notes of the IJCAI-97 Workshop on Frontiers of Inductive Logic Programming, pp. 7–11, Nagoya, Japan, August 1997.
Mary Elaine Califf, Ph.D. Alumni, mecaliff [at] ilstu edu
Raymond J. Mooney, Faculty, mooney [at] cs utexas edu