Relational Learning of Pattern-Match Rules for Information Extraction (1999)
Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for machine learning. This paper presents a system, Rapier, that takes pairs of sample documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. Rapier employs a bottom-up learning algorithm which incorporates techniques from several inductive logic programming systems and acquires unbounded patterns that include constraints on the words, part-of-speech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.
In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), pp. 328-334, Orlando, FL, July 1999.
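
As a rough illustration of the kind of pattern-match rule the abstract describes, the Python sketch below applies one hand-written rule that constrains words and part-of-speech tags in the filler and in the text on either side of it. This is not Rapier's code or rule syntax. The names Element, Rule, and extract are hypothetical, and so are the split into pre/filler/post lists and the sample sentence and tags. The sketch is also simplified: each element matches exactly one token, so patterns have fixed length, semantic-class constraints are left out, and rule induction from document-template pairs is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); tags here are illustrative


@dataclass
class Element:
    """Constraint on a single token; None means 'unconstrained'."""
    words: Optional[Set[str]] = None
    tags: Optional[Set[str]] = None

    def matches(self, token: Token) -> bool:
        word, tag = token
        if self.words is not None and word.lower() not in self.words:
            return False
        if self.tags is not None and tag not in self.tags:
            return False
        return True


@dataclass
class Rule:
    """Hypothetical rule: constraints before, on, and after the filler."""
    pre: List[Element]
    filler: List[Element]
    post: List[Element]

    def extract(self, tokens: List[Token]) -> List[str]:
        elements = self.pre + self.filler + self.post
        start = len(self.pre)
        end = start + len(self.filler)
        found = []
        # Slide a fixed-size window over the document and test every element.
        for i in range(len(tokens) - len(elements) + 1):
            window = tokens[i:i + len(elements)]
            if all(e.matches(t) for e, t in zip(elements, window)):
                found.append(" ".join(word for word, _ in window[start:end]))
        return found


# Made-up example: extract a city name that follows "located in"
# and precedes a comma and another proper noun.
tokens = [
    ("located", "VBN"), ("in", "IN"), ("Atlanta", "NNP"),
    (",", ","), ("Georgia", "NNP"), (".", "."),
]
rule = Rule(
    pre=[Element(words={"located"}), Element(words={"in"})],
    filler=[Element(tags={"NNP"})],
    post=[Element(words={","}), Element(tags={"NNP"})],
)
print(rule.extract(tokens))  # ['Atlanta']
```

A learner in the spirit of the abstract would start from very specific rules built from the training examples and generalize them; the sketch only covers applying a finished rule to tagged text.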

Mary Elaine Califf Ph.D. Alumni mecaliff [at] ilstu edu
Raymond J. Mooney Faculty mooney [at] cs utexas edu