Diverse Ensembles for Active Learning (2004)
Query by Committee is an effective approach to selective sampling in which disagreement among an ensemble of hypotheses is used to select data for labeling. Query by Bagging and Query by Boosting are two practical implementations of this approach that use Bagging and Boosting, respectively, to build the committees. For effective active learning, it is critical that the committee be made up of consistent hypotheses that are very different from each other. DECORATE is a recently developed method that directly constructs such diverse committees using artificial training data. This paper introduces Active-DECORATE, which uses DECORATE committees to select good training examples. Extensive experimental results demonstrate that, in general, Active-DECORATE outperforms both Query by Bagging and Query by Boosting.
In Proceedings of the 21st International Conference on Machine Learning (ICML-2004), pp. 584-591, Banff, Canada, July 2004.
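
The committee-based selection step described in the abstract can be illustrated with a minimal sketch. It is a generic Query by Committee example, not the Active-DECORATE procedure from the paper: the disagreement measure (vote entropy), the function names `vote_entropy` and `select_query`, and the toy threshold classifiers are all assumptions made for illustration.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes for one example.

    Zero when all members agree; larger when the votes are split.
    Vote entropy is an assumed disagreement measure for this sketch.
    """
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_query(committee, unlabeled):
    """Return the index of the unlabeled example with the most disagreement."""
    scores = [
        vote_entropy([member(x) for member in committee]) for x in unlabeled
    ]
    return max(range(len(unlabeled)), key=scores.__getitem__)


# Hypothetical committee: three threshold classifiers on a single feature.
committee = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]

unlabeled = [0.1, 0.45, 0.9]
chosen = select_query(committee, unlabeled)
print(unlabeled[chosen])  # the example the committee is split on
```

In this toy pool the committee agrees on 0.1 and 0.9 but splits on 0.45, so that example would be sent for labeling. How the committee itself is built (Bagging, Boosting, or DECORATE) is outside the scope of the sketch.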

Prem Melville Ph.D. Alumni pmelvi [at] us ibm com
Raymond J. Mooney Faculty mooney [at] cs utexas edu