The dataset is split into a training set and a test set. For each user,
of the ratings are kept for training (randomly selected) and the rest are used for testing. The final training set has
ratings, and the test set contains
ratings. Since the rating matrix is non-negative, we run experiments with both squared Euclidean distance and I-divergence, which are suitable Bregman divergences for non-negative data. Because the prediction performance of the co-clustering algorithm depends heavily on the number of row and column clusters, we run experiments with six settings of these numbers: 2x2, 5x5, 10x10, 15x15, 20x20, and 25x25.
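The per-user split and the two divergences described above can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function names, the `(user, item, rating)` triple layout, the `train_fraction` parameter, and the `seed` argument are all assumptions introduced here, and the training fraction used in the experiments is not reproduced.

```python
import math
import random
from collections import defaultdict


def split_per_user(ratings, train_fraction, seed=0):
    """Randomly split each user's ratings into a training and a test part.

    `ratings` is a list of (user, item, rating) triples (assumed layout).
    `train_fraction` is a hypothetical parameter; pass whatever share of
    each user's ratings should go to training.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item, rating in ratings:
        by_user[user].append((user, item, rating))

    train, test = [], []
    for user in sorted(by_user):
        rows = by_user[user]
        rng.shuffle(rows)  # random selection within this user
        n_train = int(round(train_fraction * len(rows)))
        train.extend(rows[:n_train])
        test.extend(rows[n_train:])
    return train, test


def squared_euclidean(x, y):
    """Element-wise squared Euclidean distance: (x - y)^2."""
    return (x - y) ** 2


def i_divergence(x, y):
    """Element-wise I-divergence: x*log(x/y) - x + y.

    Defined for x >= 0 and y > 0, using the convention 0*log(0) = 0.
    """
    if x == 0:
        return y
    return x * math.log(x / y) - x + y
```

Both divergences are zero when `x == y` and positive otherwise; the I-divergence is only defined for non-negative `x` and positive `y`, which is why it fits a non-negative rating matrix.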
Moreover, like k-means, Bregman co-clustering depends on the initial row and column clusters, so we initialize them in two different ways. The first is random initialization: we run the co-clustering 10 times and report the best performance. The second is to use Graclus
clustering results as the initial clusters. With Graclus, only squared Euclidean distance is used, the numbers of row and column clusters are limited to 2x2, 5x5, 10x10, 15x15, and 20x20, and the number of local search steps is varied from 0 to 45 in increments of 5.
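The random-restart protocol and the parameter grids above might be organized as in the sketch below. It is only an outline under stated assumptions: `run_once` is a hypothetical caller-supplied function that runs one co-clustering from the given initial assignments and returns an error score, and "best" is assumed to mean the lowest error. The co-clustering algorithm and the Graclus call themselves are not shown.

```python
import random

# Settings taken from the text (k row clusters x k column clusters).
CLUSTER_SIZES = [2, 5, 10, 15, 20, 25]
GRACLUS_CLUSTER_SIZES = [2, 5, 10, 15, 20]   # Graclus runs stop at 20x20
LOCAL_SEARCH_STEPS = list(range(0, 50, 5))   # 0, 5, ..., 45


def random_assignment(n_items, n_clusters, rng):
    """Assign each of n_items to a uniformly random cluster index."""
    return [rng.randrange(n_clusters) for _ in range(n_items)]


def best_of_restarts(run_once, n_rows, n_cols, k, n_restarts=10, seed=0):
    """Run co-clustering from n_restarts random initializations.

    `run_once(row_init, col_init)` is a hypothetical callback returning an
    error score for one run; the lowest score over all restarts is returned
    (assumption: lower is better).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        row_init = random_assignment(n_rows, k, rng)
        col_init = random_assignment(n_cols, k, rng)
        error = run_once(row_init, col_init)
        if best is None or error < best:
            best = error
    return best
```

Looping `best_of_restarts` over `CLUSTER_SIZES` covers the random-initialization runs, while the Graclus-initialized runs would iterate over every pair from `GRACLUS_CLUSTER_SIZES` and `LOCAL_SEARCH_STEPS`.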
Tuyen Huynh
2007-05-09