Text data mining concerns the application of data mining (knowledge discovery
in databases, KDD) to unstructured textual data. Our work focuses on using information extraction
to first extract a structured
database from a corpus of natural language texts and then discovering patterns
in the resulting database using traditional KDD tools. It also concerns record linkage,
a form of data-cleaning that identifies
equivalent but textually distinct items in the extracted data prior to mining.
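
The extract, link, and mine pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the project's actual system: the toy corpus is invented, a hand-written regular expression stands in for a learned information extractor, case normalization stands in for a real record-linkage method, and counting frequent co-occurring pairs stands in for a traditional KDD tool.

```python
import re
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration only.
CORPUS = [
    "Posting 1: Seeking developer with Java and SQL experience.",
    "Posting 2: Must know java, J2EE and sql.",
    "Posting 3: Python and SQL required.",
]

# Step 1: information extraction.
# A fixed pattern is a hypothetical stand-in for a trained extractor.
SKILL_PATTERN = re.compile(r"\b(java|j2ee|sql|python)\b", re.IGNORECASE)


def extract_record(text):
    """Turn one free-text document into a structured record."""
    return {"skills": SKILL_PATTERN.findall(text)}


# Step 2: record linkage (greatly simplified).
# Map equivalent but textually distinct strings to one canonical form.
def canonical(skill):
    return skill.strip().lower()


def link(record):
    """Deduplicate a record's field values after canonicalization."""
    return {"skills": sorted({canonical(s) for s in record["skills"]})}


# Step 3: mining.
# Keep pairs of values that co-occur in at least min_support records.
def frequent_pairs(records, min_support=2):
    counts = Counter()
    for record in records:
        counts.update(combinations(record["skills"], 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


database = [link(extract_record(text)) for text in CORPUS]
patterns = frequent_pairs(database)
print(patterns)
```

Without the linkage step, "Java" and "java" would be counted as different items and the co-occurrence with SQL would fall below the support threshold, which is why cleaning is done prior to mining.
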
This work is also related to our research on natural language learning.
Our recent work has focused on text mining for
This research was formerly supported by the National Science Foundation through
grant IIS-0117308 from the "Information and Data Management" Program.