/*   spkmeans-- A Clustering and Dimension Reduction toolkit
     Copyright(c) 2001
     Yuqiang Guan
*/

This is the README file for spkmeans.

  -i f        gives the initial seeding file; otherwise the initial clusters
              will be generated by random perturbation
  -c          gives the number of clusters
  -K          gives the target dimension for dimension reduction
  -O          gives the name of the output cluster matrix
  -t          gives the scaling, e.g. 'txx', 'tfn', etc.
  -D [c|q|b]  does concept decomposition:
              'c' is concept decomposition,
              'q' is QR decomposition of the concept matrix,
              'b' is 'c'+'q'
  -T          gives the true label file
  -e          gives epsilon
  -n          disables dump information (confusion matrix, purity, etc.)

* If you just want to cluster:

    spkmeans -c 3 [-i f initial_seeds_file] [-T true_label_file] -t tfn -O classic3 ./classic3

* If you just want dimension reduction:

    spkmeans -K 10 -i [g|f] [graph-part-file|initial_seeds_file] -t tfn -O classic3 -D [c|q|b] ./classic3

* If you want dimension reduction first and then cluster the
  dimension-reduced matrix:

    spkmeans -c 3 -K 10 -i [g|f] [graph-part-file|initial_seeds_file] -t tfn -O classic3 -D [c|q|b] ./classic3

The file name (./classic3 in the examples) is the prefix of 5 matrix file
names (_dim, _col_ccs, _row_ccs, _tfn_nz and _docs).

* True label file has the format of

    #of Docs. 1 ClusterID [x /path/file] ...

* Initial seeding file has the format of

    #of Docs. 1 ClusterID [x /path/file] ...

  where ClusterID should cover 0 to #ofClusters. IDs that are out of this
  range will be ignored.
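
For scripted runs, the command lines in this README can be assembled programmatically. What follows is a minimal Python sketch and is not part of spkmeans: the function name build_spkmeans_argv and all of its parameter names are hypothetical, it uses only the flags documented in this README, and it assumes the spkmeans binary is on your PATH if you actually run the result.

```python
import shlex


def build_spkmeans_argv(prefix, n_clusters=None, reduced_dim=None,
                        init_kind=None, init_file=None, labels_file=None,
                        scaling="tfn", output=None, decomposition=None):
    """Assemble an argv list using only the flags listed in the README.

    prefix        -- matrix file prefix, e.g. "./classic3"
    n_clusters    -- value for -c (omit for dimension reduction only)
    reduced_dim   -- value for -K (omit for clustering only)
    init_kind     -- "f" (seeding file) or "g" (graph-part file) for -i
    decomposition -- "c", "q" or "b" for -D
    """
    argv = ["spkmeans"]
    if n_clusters is not None:
        argv += ["-c", str(n_clusters)]
    if reduced_dim is not None:
        argv += ["-K", str(reduced_dim)]
    if init_kind is not None:
        if init_kind not in ("f", "g"):
            raise ValueError("init_kind must be 'f' or 'g'")
        if init_file is None:
            raise ValueError("init_file is required when init_kind is set")
        argv += ["-i", init_kind, init_file]
    if labels_file is not None:
        argv += ["-T", labels_file]
    argv += ["-t", scaling]
    if output is not None:
        argv += ["-O", output]
    if decomposition is not None:
        if decomposition not in ("c", "q", "b"):
            raise ValueError("decomposition must be 'c', 'q' or 'b'")
        argv += ["-D", decomposition]
    argv.append(prefix)
    return argv


if __name__ == "__main__":
    # Clustering only, mirroring the first usage example in the README.
    cmd = build_spkmeans_argv("./classic3", n_clusters=3, output="classic3")
    print(shlex.join(cmd))
    # To execute it (assumes spkmeans is installed and on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems when the prefix or seeding file path contains spaces.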