Knowledge Base Population using Stacked Ensembles of Information Extractors (2015)
Vidhoon Viswanathan
The performance of relation extractors plays a significant role in the automatic creation of knowledge bases from web corpora. Using automated systems to create knowledge bases from the web is known as Knowledge Base Population (KBP). The Text Analysis Conference conducts English Slot Filling (ESF) and Slot Filler Validation (SFV) tasks as part of its KBP track to promote research in this area. Slot Filling systems are developed to perform relation extraction for specific relation and entity types. Several participating universities have built Slot Filling systems that address different aspects of these tasks using different algorithms and techniques.

In this thesis, we investigate the use of ensemble learning to combine the output of existing individual Slot Filling systems. We are the first to employ Stacking, a type of ensemble learning algorithm, to ensemble Slot Filling systems for the KBP ESF and SFV tasks. Our approach builds an ensemble classifier that learns to meaningfully combine the output of different Slot Filling systems and predict the correctness of their extractions. Our experimental evaluation shows that Stacking is useful for ensembling Slot Filling systems. We demonstrate new state-of-the-art results for the KBP ESF task. Our proposed system achieves an F1 score of 47.
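The stacking idea can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the features or the meta-classifier, so the sketch assumes that each candidate slot fill is represented by the confidence scores of three hypothetical Slot Filling systems (0.0 when a system did not produce the fill) and that the meta-classifier is a plain logistic regression trained by gradient descent on toy data.

```python
import math


def predict_proba(weights, bias, x):
    """Probability that a candidate slot fill is correct."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_stacker(features, labels, epochs=2000, lr=0.5):
    """Fit a logistic-regression meta-classifier by batch gradient descent."""
    n_features = len(features[0])
    n = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy training data (made up for illustration): one row per candidate
# slot fill, one column per hypothetical Slot Filling system.
features = [
    [0.9, 0.8, 0.0],
    [0.7, 0.0, 0.6],
    [0.0, 0.9, 0.7],
    [0.8, 0.7, 0.9],
    [0.2, 0.0, 0.0],
    [0.0, 0.3, 0.0],
    [0.0, 0.0, 0.4],
    [0.1, 0.2, 0.0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = correct extraction, 0 = incorrect

weights, bias = train_stacker(features, labels)

# Score new candidates: one backed by all systems, one backed weakly by one.
strong = predict_proba(weights, bias, [0.9, 0.9, 0.8])
weak = predict_proba(weights, bias, [0.1, 0.0, 0.0])
print(f"strong candidate: {strong:.2f}, weak candidate: {weak:.2f}")
```

In this sketch a candidate would be accepted into the knowledge base when its predicted probability exceeds a chosen threshold such as 0.5; the threshold, like the features, is an assumption of the example.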

Given the complexity of developing Slot Filling systems from scratch, our promising results indicate that performance on Slot Filling tasks can be improved in a shorter timeframe by ensembling existing systems. Our work encourages further research into other methods for ensembling Slot Filling systems.

Masters Thesis, Department of Computer Science, The University of Texas at Austin.