Distance Metric Learning Software |
ITML
(Version 1.1)
is a Matlab implementation of the Information-Theoretic
Metric Learning algorithm. Metric learning involves
finding a suitable metric for a given set of data points, using
side information about the distances between a few of those points. ITML
characterizes the metric using a Mahalanobis distance function and
learns the associated parameters using Bregman's cyclic projection
algorithm.
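
To make the Mahalanobis parameterization concrete, here is a minimal Python sketch (not part of the Matlab package) that evaluates the squared form (x - y)^T A (x - y). The function name `mahalanobis_sq` and the example matrices are illustrative assumptions. It does not perform the learning step.

```python
def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y) for a square matrix A."""
    d = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product A d, followed by the inner product with d.
    Ad = [sum(a_ij * d_j for a_ij, d_j in zip(row, d)) for row in A]
    return sum(d_i * ad_i for d_i, ad_i in zip(d, Ad))


# Hypothetical example matrices: the identity recovers squared Euclidean distance,
# while the second matrix weights the first coordinate more heavily.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[4.0, 0.0], [0.0, 1.0]]

print(mahalanobis_sq([1.0, 2.0], [0.0, 0.0], identity))
print(mahalanobis_sq([1.0, 2.0], [0.0, 0.0], stretched))
```

A metric learner would choose A so that points known to be similar end up close and points known to be dissimilar end up far apart under this distance.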
|
Graph Clustering Software |
Graclus
(Version 1.0 & Version 1.1) is fast graph clustering software that computes normalized cut and ratio association for a given graph without any eigenvector computation. This is possible because we establish a mathematical equivalence between general cut or association objectives (including normalized cut and ratio association) and the weighted kernel k-means objective. One important implication of this equivalence is that we can run a k-means type of iterative algorithm to minimize general cut or association objectives. Therefore, unlike spectral methods, our algorithm avoids time-consuming eigenvector computation entirely. We embed the weighted kernel k-means algorithm in a multilevel framework to develop this fast software for graph clustering.
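
As an illustration of the normalized cut objective mentioned above, the following Python sketch evaluates it for a given partition, using the standard definition (for each cluster, the weight of edges leaving the cluster divided by the total degree of the cluster). It only scores a partition and is not Graclus's multilevel algorithm. The function name and the toy graph are assumptions made for the example.

```python
def normalized_cut(W, labels):
    """Normalized cut of a partition: sum over clusters of cut(c) / degree(c).

    W is a symmetric adjacency matrix (list of lists); labels[i] is node i's cluster.
    """
    n = len(W)
    total = 0.0
    for c in set(labels):
        inside = [i for i in range(n) if labels[i] == c]
        # Total edge weight incident to the cluster's nodes.
        degree = sum(W[i][j] for i in inside for j in range(n))
        # Edge weight that crosses from the cluster to the rest of the graph.
        cut = sum(W[i][j] for i in inside for j in range(n) if labels[j] != c)
        if degree > 0:
            total += cut / degree
    return total


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    W[i][j] = W[j][i] = 1.0

good = [0, 0, 0, 1, 1, 1]  # splits along the bridge edge
bad = [0, 1, 0, 1, 0, 1]   # cuts through both triangles
print(normalized_cut(W, good), normalized_cut(W, bad))
```

The partition that separates the two triangles scores lower, which is the behavior a normalized cut minimizer looks for.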
|
Co-Clustering Software |
Co-cluster
(Version 1.1) is a C++ program that implements three co-clustering algorithms:
an information-theoretic co-clustering algorithm and two types of
minimum sum-squared residue co-clustering algorithms.
In our implementation, all the algorithms have a ping-pong structure,
i.e., a batch algorithm followed by a corresponding chain of first variations.
Each algorithm also has five variations,
which differ in the order in which the row and column centroids are updated.
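
The description above does not define the two residue types, so the following Python sketch rests on an assumption: that they are the two residues commonly used in the minimum sum-squared residue literature. The first subtracts the co-cluster mean from each entry. The second subtracts the row mean and column mean within the co-cluster and adds the co-cluster mean back. The function names are hypothetical and the code is unrelated to the C++ program's internals.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def sum_squared_residue(A, row_labels, col_labels, kind=1):
    """Sum of squared residues of matrix A under a row/column co-clustering.

    kind=1 (assumed): h_ij = a_ij - block_mean
    kind=2 (assumed): h_ij = a_ij - row_mean - col_mean + block_mean
    Means are taken within the co-cluster that contains entry (i, j).
    """
    rows, cols = range(len(A)), range(len(A[0]))
    total = 0.0
    for i in rows:
        for j in cols:
            I = [r for r in rows if row_labels[r] == row_labels[i]]
            J = [c for c in cols if col_labels[c] == col_labels[j]]
            block_mean = mean(A[r][c] for r in I for c in J)
            if kind == 1:
                h = A[i][j] - block_mean
            else:
                row_mean = mean(A[i][c] for c in J)
                col_mean = mean(A[r][j] for r in I)
                h = A[i][j] - row_mean - col_mean + block_mean
            total += h * h
    return total


# Hypothetical 2x2 block with an additive pattern (entry = row offset + column offset).
A = [[1.0, 2.0], [2.0, 3.0]]
print(sum_squared_residue(A, [0, 0], [0, 0], kind=1))
print(sum_squared_residue(A, [0, 0], [0, 0], kind=2))
```

Under these assumed definitions, the second residue vanishes on purely additive blocks, while the first vanishes only on constant blocks.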
|
Clustering Software |
Gmeans
is a C++ program for clustering. At the heart of the program is the K-means clustering algorithm
with four different distance (similarity) measures, six initialization methods,
and a powerful local search strategy called first variation.
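
The following Python sketch shows one way to read the first variation idea. It assumes that a first variation step moves a single point to a different cluster when that lowers the objective, and it uses squared Euclidean distance for simplicity. It is a brute-force illustration with hypothetical function names, not the Gmeans implementation.

```python
def centroid(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def objective(points, labels):
    """K-means objective: sum of squared Euclidean distances to cluster centroids."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        mu = centroid(members)
        total += sum(sum((x - m) ** 2 for x, m in zip(p, mu)) for p in members)
    return total


def first_variation_step(points, labels, k):
    """Try every single-point move and return the labeling that lowers the objective most."""
    best_labels, best_value = labels, objective(points, labels)
    for i in range(len(points)):
        # Skip moves that would leave the point's current cluster empty.
        if labels.count(labels[i]) == 1:
            continue
        for c in range(k):
            if c == labels[i]:
                continue
            trial = labels[:i] + [c] + labels[i + 1:]
            value = objective(points, trial)
            if value < best_value:
                best_labels, best_value = trial, value
    return best_labels, best_value


# Hypothetical one-dimensional data with a deliberately poor starting partition.
points = [[0.0], [1.0], [2.0], [10.0]]
start = [0, 0, 1, 1]
new_labels, new_value = first_variation_step(points, start, k=2)
print(new_labels, new_value)
```

Recomputing the full objective for every candidate move keeps the sketch easy to check. A practical implementation would use an incremental update of the objective instead.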
|
Visualization Software |
CViz
is a visualization tool designed for analyzing high-dimensional data (data with many elements) in large,
complex data sets. CViz easily loads the data sets, displays the most important factors relating clusters of records,
and provides full-motion visualization of the inherent data clusters.
|
|