Semisupervised Clustering for Intelligent User Management (2004)
Grouping users automatically based on their system usage can be beneficial in an autonomic computing environment. Clustering algorithms can generate meaningful user groups that provide important insights to system administrators about user profiles and group policies. In particular, if a small amount of supervision is provided by the administrator to the clustering process, semi-supervised clustering algorithms can use this supervision to generate clusters which are more useful for user management. In this work, we demonstrate the utility of semi-supervised clustering in intelligent user management. We collect publicly available system usage data of users in a university computing environment, and cluster the users using semi-supervised hierarchical agglomerative clustering based on the profile of the processes they run. Initial supervision is provided in the form of a few users running a specific process. Semi-supervised clustering gives us more meaningful clusters than unsupervised clustering in this domain, demonstrating that our technique can find interesting and useful groups in data with minimal user intervention.
View:
PDF, PS
Citation:
In Proceedings of the IBM Austin Center for Advanced Studies 5th Annual Austin CAS Conference, Austin, TX, February 2004.
Bibtex:

Sugato Basu Ph.D. Alumni sugato [at] cs utexas edu
Mikhail Bilenko Ph.D. Alumni mbilenko [at] microsoft com
Raymond J. Mooney Faculty mooney [at] cs utexas edu