Collective Information Extraction with Relational Markov Networks (2004)
Most information extraction (IE) systems treat separate potential extractions as independent. However, in many cases, considering influences between different potential extractions could improve overall accuracy. Statistical methods based on undirected graphical models, such as conditional random fields (CRFs), have been shown to be an effective approach to learning accurate IE systems. We present a new IE method that employs Relational Markov Networks (a generalization of CRFs), which can represent arbitrary dependencies between extractions. This allows for ``collective information extraction'' that exploits the mutual influence between possible extractions. Experiments on learning to extract protein names from biomedical text demonstrate the advantages of this approach.
In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), pp. 439-446, Barcelona, Spain, July 2004.

Razvan Bunescu Ph.D. Alumni bunescu [at] ohio edu
Raymond J. Mooney Faculty mooney [at] cs utexas edu