Semi-Supervised Learning
In many learning tasks, unlabeled data is plentiful but labeled data is scarce because labels are expensive to produce. Semi-supervised learning uses both labeled and unlabeled data during training to improve performance, and it applies to both classification and clustering. In supervised classification, there is a known, fixed set of categories, and category-labeled training data is used to induce a classification function. In semi-supervised classification, training also exploits additional unlabeled data, frequently resulting in a more accurate classification function. In semi-supervised clustering, some labeled data is used along with the unlabeled data to obtain a better clustering.
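As a rough illustration of semi-supervised clustering, the sketch below runs k-means but initializes each centroid from a few labeled points instead of at random. It is a minimal example written for this page, not the implementation from any paper listed here; the function name `seeded_kmeans`, its parameters, and the toy data are all assumptions.

```python
def seeded_kmeans(points, seeds, n_iter=20):
    """Cluster points with k-means, starting each centroid from labeled seeds.

    points: list of feature tuples (the unlabeled data)
    seeds:  dict mapping a cluster label to a list of labeled feature tuples
    (Hypothetical interface, chosen only for this illustration.)
    """
    labels = sorted(seeds)
    # Labeled seeds take part in clustering alongside the unlabeled points.
    all_points = list(points) + [p for lab in labels for p in seeds[lab]]

    def mean(rows):
        return tuple(sum(col) / len(rows) for col in zip(*rows))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def nearest(p, centroids):
        return min(labels, key=lambda lab: sq_dist(p, centroids[lab]))

    # The labeled data supplies the initial centroids.
    centroids = {lab: mean(seeds[lab]) for lab in labels}

    for _ in range(n_iter):
        groups = {lab: [] for lab in labels}
        for p in all_points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = {
            lab: mean(groups[lab]) if groups[lab] else centroids[lab]
            for lab in labels
        }
        if updated == centroids:
            break
        centroids = updated

    assignment = [nearest(p, centroids) for p in points]
    return centroids, assignment


# Toy data: two well-separated groups and one labeled seed per cluster.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
seeds = {"a": [(0.0, 0.0)], "b": [(5.0, 5.0)]}
centroids, assignment = seeded_kmeans(points, seeds)
print(assignment)
```

In this sketch the seeds only guide the initialization and may be reassigned in later iterations; a variant that pins seeds to their given labels would enforce the supervision more strictly.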
Dan Garrette Ph.D. Student dhg [at] cs utexas edu
Elad Liebman Ph.D. Student eladlieb [at] cs utexas edu
Weakly-Supervised Bayesian Learning of a CCG Supertagger 2014
Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith, In Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL-2014), pp. 141--150, Baltimore, MD, June 2014.
Learning a Part-of-Speech Tagger from Two Hours of Annotation 2013
Dan Garrette and Jason Baldridge, In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-13) (2013), pp. 138--147.
Real-World Semi-Supervised Learning of POS-Taggers for Low-Resource Languages 2013
Dan Garrette, Jason Mielens, and Jason Baldridge, In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL-2013) (2013), pp. 583--592.
Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries 2012
Dan Garrette and Jason Baldridge, In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2012), pp. 821--831, Jeju, Korea, July 2012.
Semi-supervised graph clustering: a kernel approach 2009
Brian Kulis, Sugato Basu, Inderjit Dhillon, and Raymond Mooney, Machine Learning Journal, Vol. 74, 1 (2009), pp. 1-22.
Watch, Listen & Learn: Co-training on Captioned Images and Videos 2008
Sonal Gupta, Joohyun Kim, Kristen Grauman, and Raymond Mooney, In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), pp. 457--472, Antwerp, Belgium, September 2008.
Semi-Supervised Learning for Semantic Parsing using Support Vector Machines 2007
Rohit J. Kate and Raymond J. Mooney, In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Short Papers (NAACL/HLT-2007), pp. 81--84, Rochester...
Learnable Similarity Functions and Their Application to Record Linkage and Clustering 2006
Mikhail Bilenko, PhD Thesis, Department of Computer Sciences, University of Texas at Austin. 136 pages.
Probabilistic Semi-Supervised Clustering with Constraints 2006
Sugato Basu, Mikhail Bilenko, Arindam Banerjee, and Raymond J. Mooney, In Semi-Supervised Learning, O. Chapelle, B. Schölkopf, and A. Zien (Eds.), Cambridge, MA, 2006. MIT Press.
Semi-supervised Clustering: Probabilistic Models, Algorithms and Experiments 2005
Sugato Basu, PhD Thesis, University of Texas at Austin.
Semi-supervised Graph Clustering: A Kernel Approach 2005
B. Kulis, S. Basu, I. Dhillon and Raymond J. Mooney, In Proceedings of the 22nd International Conference on Machine Learning, pp. 457--464, Bonn, Germany, August 2005. (Distinguished Student Paper Award).
A Comparison of Inference Techniques for Semi-supervised Clustering with Hidden Markov Random Fields 2004
Mikhail Bilenko and Sugato Basu, In Proceedings of the ICML-2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (SRL-2004), Banff, Canada, July 2004.
A Probabilistic Framework for Semi-Supervised Clustering 2004
Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney, In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), pp. 59-68, Seattle, WA, August 2004.
Active Semi-Supervision for Pairwise Constrained Clustering 2004
Sugato Basu, Arindam Banerjee, and Raymond J. Mooney, In Proceedings of the 2004 SIAM International Conference on Data Mining (SDM-04), April 2004.
Integrating Constraints and Metric Learning in Semi-Supervised Clustering 2004
Mikhail Bilenko, Sugato Basu, and Raymond J. Mooney, In Proceedings of 21st International Conference on Machine Learning (ICML-2004), pp. 81-88, Banff, Canada, July 2004.
Learnable Similarity Functions and Their Applications to Clustering and Record Linkage 2004
Mikhail Bilenko, In Proceedings of the Ninth AAAI/SIGART Doctoral Consortium, pp. 981--982, San Jose, CA, July 2004.
Semi-supervised Clustering with Limited Background Knowledge 2004
Sugato Basu, In Proceedings of the Ninth AAAI/SIGART Doctoral Consortium, pp. 979--980, San Jose, CA, July 2004.
Semi-supervised Clustering: Learning with Limited User Feedback 2004
Sugato Basu, Technical Report, Cornell University.
Semisupervised Clustering for Intelligent User Management 2004
Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney, In Proceedings of the IBM Austin Center for Advanced Studies 5th Annual Austin CAS Conference, Austin, TX, February 2004.
Comparing and Unifying Search-Based and Similarity-Based Approaches to Semi-Supervised Clustering 2003
Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney, In Proceedings of the ICML-2003 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, pp. 42-49, Washington, DC, 2003.
Learnable Similarity Functions and Their Applications to Record Linkage and Clustering 2003
Mikhail Bilenko, unpublished. Doctoral Dissertation Proposal, University of Texas at Austin.
Semi-supervised Clustering by Seeding 2002
Sugato Basu, Arindam Banerjee, and Raymond J. Mooney, In Proceedings of the 19th International Conference on Machine Learning (ICML-2002), pp. 19-26, 2002.