Below are brief outlines of some projects that I have recently worked on, along with some applications.
Spectral clustering partitions the nodes of a graph by minimizing one of several graph cut objectives; it has received a significant amount of attention recently and is used in numerous applications. I worked on Graclus, the first algorithm for optimizing spectral clustering objectives that does not use any eigenvector computation. This is possible due to a mathematical connection between spectral clustering and weighted kernel k-means. Our resulting multilevel algorithm outperforms the best existing spectral methods in terms of quality, speed, and memory usage. Applications include fast image segmentation, speech segmentation, and gene network clustering.
Graclus is available for download here.
Relevant papers:
Bregman matrix divergences are natural nearness measures for matrices; examples include the squared Frobenius norm distance, the von Neumann divergence, and the LogDet matrix divergence. We have demonstrated new properties of Bregman matrix divergences which lead to efficient algorithms for various matrix nearness problems.
One of the key applications of this work is metric learning, the problem of learning a distance function between data objects. Our approach is more scalable than existing techniques, can be applied to high-dimensional data, and can incorporate locality-sensitive hashing for sublinear-time similarity searches. This work has been applied to automated software bug classification for the Clarify system and learning image similarity for fast image search. With image search, we have applied metric learning to human body pose estimation, object classification, and feature indexing for 3-d reconstruction. The figure above shows our method applied to human body pose estimation over a database of 500,000 images.
NEW!! Our metric learning software is available here.
Relevant papers: