Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: Text Categorization and Clustering

The ability to categorize natural-language documents and web pages into known categories using supervised learning, or to cluster them into meaningful new categories using unsupervised learning, has important applications in information retrieval, information filtering, knowledge management, and recommender systems. Our research has focused on applications of text learning to recommender systems and on semi-supervised clustering of documents.
  1. Detecting Promotional Content in Wikipedia
    [Details] [PDF] [Slides]
    Shruti Bhosale, Heath Vinicombe, and Raymond J. Mooney
    In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), 1851-1857, Seattle, WA, October 2013.
  2. Spherical Topic Models
    [Details] [PDF] [Slides]
    Joseph Reisinger, Austin Waters, Bryan Silverthorn, and Raymond J. Mooney
    In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), 2010.
  3. Multi-Prototype Vector-Space Models of Word Meaning
    [Details] [PDF] [Slides]
    Joseph Reisinger and Raymond J. Mooney
    In Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-2010), 109-117, 2010.
  4. Spherical Topic Models
    [Details] [PDF]
    Joseph Reisinger, Austin Waters, Bryan Silverthorn, and Raymond J. Mooney
    In NIPS'09 workshop: Applications for Topic Models: Text and Beyond, 2009.
  5. Probabilistic Semi-Supervised Clustering with Constraints
    [Details] [PDF]
    Sugato Basu, Mikhail Bilenko, Arindam Banerjee, and Raymond J. Mooney
    In O. Chapelle, B. Schölkopf, and A. Zien, editors, Semi-Supervised Learning, Cambridge, MA, 2006. MIT Press.
  6. Semi-supervised Clustering: Probabilistic Models, Algorithms and Experiments
    [Details] [PDF]
    Sugato Basu
    PhD Thesis, University of Texas at Austin, 2005.
  7. Model-based Overlapping Clustering
    [Details] [PDF]
    A. Banerjee, C. Krumpelman, S. Basu, Raymond J. Mooney, and Joydeep Ghosh
    In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-05), 2005.
  8. A Probabilistic Framework for Semi-Supervised Clustering
    [Details] [PDF]
    Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney
    In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), 59-68, Seattle, WA, August 2004.
  9. Semi-supervised Clustering with Limited Background Knowledge
    [Details] [PDF]
    Sugato Basu
    In Proceedings of the Ninth AAAI/SIGART Doctoral Consortium, 979-980, San Jose, CA, July 2004.
  10. Active Semi-Supervision for Pairwise Constrained Clustering
    [Details] [PDF]
    Sugato Basu, Arindam Banerjee, and Raymond J. Mooney
    In Proceedings of the 2004 SIAM International Conference on Data Mining (SDM-04), April 2004.
  11. Semi-supervised Clustering: Learning with Limited User Feedback
    [Details] [PDF]
    Sugato Basu
    Technical Report, Cornell University, 2004.
  12. Semi-supervised Clustering by Seeding
    [Details] [PDF]
    Sugato Basu, Arindam Banerjee, and Raymond J. Mooney
    In Proceedings of the 19th International Conference on Machine Learning (ICML-2002), 19-26, 2002.
  13. Content-Based Book Recommending Using Learning for Text Categorization
    [Details] [PDF]
    Raymond J. Mooney and Loriene Roy
    In Proceedings of the Fifth ACM Conference on Digital Libraries, 195-204, San Antonio, TX, June 2000.
  14. Content-Based Book Recommending Using Learning for Text Categorization
    [Details] [PDF]
    Raymond J. Mooney and Loriene Roy
    In Proceedings of the SIGIR-99 Workshop on Recommender Systems: Algorithms and Evaluation, Berkeley, CA, August 1999.
  15. Using HTML Structure and Linked Pages to Improve Learning for Text Categorization
    [Details] [PDF]
    Michael B. Cline
    Technical Report AI 98-270, Department of Computer Sciences, University of Texas at Austin, Austin, TX, May 1999. Undergraduate Honors Thesis.
  16. Book Recommending Using Text Categorization with Extracted Information
    [Details] [PDF]
    Raymond J. Mooney, Paul N. Bennett, and Loriene Roy
    In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), 70-74, Madison, WI, 1998.
  17. Text Categorization Through Probabilistic Learning: Applications to Recommender Systems
    [Details] [PDF]
    Paul N. Bennett
    Honors thesis, Department of Computer Sciences, The University of Texas at Austin, 1998.