WekaUT contains a modified version of Weka. The modifications include classes for semi-supervised clustering of text data.

DISCLAIMER: Currently the code works with java-1.4.2.

To untar and compile:
    tar -xvzf wekaUT.tar.gz
    cd wekaUT/weka-latest
    # Edit the Makefile: set INSTALL_DIR to your wekaUT install dir
    # Edit the Makefile: set JAVADIR to your java-1.4.2 install dir
    make

To run MPCKMeans (metric pairwise constrained KMeans) on iris data without constraints:
    java weka/clusterers/MPCKMeans -D data/iris.arff

To run MPCKMeans (metric pairwise constrained KMeans) on iris data with constraints:
    java weka/clusterers/MPCKMeans -D data/iris.arff -C data/iris.constraints

Description of MPCKMeans options: MPCKMeans.options

Short options for running MPCKMeans on iris: iris.options

Detailed options for running MPCKMeans on iris: iris.longoptions

Sample constraints file for iris: iris.constraints
   Each line in this file has three tab-separated fields:
       Instance1 Index \t Instance2 Index \t 1/-1
   The third field is 1 for a must-link (ML) constraint and -1 for a cannot-link (CL) constraint.
   Instance indices start from 0.
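
   The constraints format described above can be checked with a few lines of Java. This is only a minimal sketch for validating your own constraints files: the class ConstraintLine and its parse method are hypothetical names and are not part of WekaUT, and the sample line in main is made up for illustration rather than taken from iris.constraints.

```java
// Hypothetical helper, not part of WekaUT: parses one line of a constraints file.
public class ConstraintLine {
    public static final int MUST_LINK = 1;    // third field is 1
    public static final int CANNOT_LINK = -1; // third field is -1

    /** Returns {instance1Index, instance2Index, type} for a line such as "0\t1\t1". */
    public static int[] parse(String line) {
        String[] fields = line.trim().split("\t");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 tab-separated fields: " + line);
        }
        int first = Integer.parseInt(fields[0].trim());
        int second = Integer.parseInt(fields[1].trim());
        int type = Integer.parseInt(fields[2].trim());
        // Instance indices start from 0, so negative values are invalid.
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("Instance indices must be >= 0: " + line);
        }
        if (type != MUST_LINK && type != CANNOT_LINK) {
            throw new IllegalArgumentException("Constraint type must be 1 or -1: " + line);
        }
        return new int[] { first, second, type };
    }

    public static void main(String[] args) {
        // Made-up sample line: instances 0 and 1 must be in the same cluster.
        int[] c = parse("0\t1\t1");
        String kind = (c[2] == MUST_LINK) ? "must-link" : "cannot-link";
        System.out.println(c[0] + " " + c[1] + " " + kind);
    }
}
```

   The sketch rejects lines with a missing field, a negative index, or a type other than 1 or -1, which makes it easier to catch formatting mistakes before passing the file to MPCKMeans with -C.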

