- Relational Learning of Pattern-Match Rules for Information Extraction
Mary Elaine Califf and Raymond J. Mooney
Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), Orlando, FL, pp. 328-334, July 1999.
Paper ID: 94
Category: Information Extraction, Inductive Logic Programming, Natural Language Learning
Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for machine learning. This paper presents a system, Rapier, that takes pairs of sample documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. Rapier employs a bottom-up learning algorithm which incorporates techniques from several inductive logic programming systems and acquires unbounded patterns that include constraints on the words, part-of-speech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.

mooney@cs.utexas.edu