A Study of Machine Reading from Multiple Texts

Reference: P. Clark, J. Thompson. A Study of Machine Reading from Multiple Texts. Submitted to the AAAI Spring Symposium on Learning by Reading and Learning to Read, 2009.

Abstract: A system that seeks to build a semantically coherent representation from multiple texts requires (at least) three things: a representation language that is sufficiently expressive to capture the information conveyed by the text; a natural language engine that can interpret text and generate semantic representations in that language with reasonable reliability; and a knowledge integration capability that can integrate information from different texts and from background knowledge into a coherent whole. In this paper we present a case study of these requirements for interpreting four different paragraphs of text (from different sources), each describing how a two-stroke combustion engine behaves. We identify the challenges involved in meeting these requirements and how they might be addressed. One key feature that emerges is the need for extensive background knowledge to guide the interpretation, disambiguate, and fill in gaps. The resulting contribution of this paper is a deeper understanding of the overall machine reading task.

Slide Presentation (PowerPoint): (Note: The presentation contains significant new material that is not in the paper.)