Active Multitask Learning Using Both Latent and Supervised Shared Topics (2014)
Ayan Acharya, Raymond J. Mooney, and Joydeep Ghosh
Multitask learning (MTL) via a shared representation has been adopted to alleviate problems with sparsity of labeled data across different learning tasks. Active learning, on the other hand, reduces the cost of labeling examples by making informative queries over an unlabeled pool of data. Therefore, a unification of both of these approaches can potentially be useful in settings where labeled information is expensive to obtain but the learning tasks or domains have some common characteristics. This paper introduces two such models -- Active Doubly Supervised Latent Dirichlet Allocation (Act-DSLDA) and its non-parametric variation (Act-NPDSLDA) that integrate MTL and active learning in the same framework. These models make use of both latent and supervised shared topics to accomplish multitask learning. Experimental results on both document and image classification show that integrating MTL and active learning along with shared latent and supervised topics is superior to other methods which do not employ all of these components.
In Proceedings of the 2014 SIAM International Conference on Data Mining (SDM14), Philadelphia, Pennsylvania, April 2014.

Slides (PDF)
Raymond J. Mooney Faculty mooney [at] cs utexas edu