Next: Bibliography Up: Using Co-clustering for Predicting Previous: Results with local SVD

Conclusion

In this final project we tried many combinations of the Bregman co-clustering framework, Graclus, and SVD to predict movie ratings for a subset of the Netflix dataset. The first method uses co-clustering to find the best matrix approximation based on six different criteria. The two best schemes, 3 and 5, achieve comparable results, with RMSEs of 0.9103 and 0.9109 respectively. If these schemes achieve the same performance on the original Netflix dataset, the approach would amount to an improvement of about 4%. The second method uses the clustering results of Graclus as the initial row and column clusters for the co-clustering algorithm. We did not observe any improvement with this method. The third and fourth methods apply SVD to each cocluster produced by co-clustering or Graclus. These approaches show a significant improvement: applying SVD to scheme 3 with the 20x20 setting reduces the RMSE from 0.91 to 0.9039. The local SVD therefore helps movie rating prediction. Co-clustering or Graclus lets us scale down the dataset and makes the simple version of SVD feasible on the resulting coclusters.
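As an illustration of the local SVD idea, the following is a minimal sketch that fits a low-rank approximation to a single cocluster block. The function names `local_svd_predict` and `rmse`, the `rank` parameter, and the choice to fill missing entries with the block mean are assumptions made for this example, not necessarily the procedure used in the experiments.

```python
import numpy as np

def local_svd_predict(block, mask, rank=2):
    """Approximate one cocluster block with a truncated SVD.

    block: 2-D array of ratings (entries where mask is False are ignored)
    mask:  boolean array, True where a rating is known
    """
    block = np.asarray(block, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Assumption: missing entries are imputed with the mean of the known ratings.
    mean = block[mask].mean() if mask.any() else 0.0
    filled = np.where(mask, block, mean)
    # A dense SVD is affordable here because a single cocluster is small.
    u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
    k = min(rank, len(s))
    # Keep the top-k singular triplets and add the mean back.
    return (u[:, :k] * s[:k]) @ vt[:k, :] + mean

def rmse(pred, truth, mask):
    """Root mean squared error over the entries selected by mask."""
    diff = (np.asarray(pred) - np.asarray(truth))[mask]
    return float(np.sqrt(np.mean(diff ** 2)))
```

In a full pipeline one would presumably call such a function once per cocluster and evaluate `rmse` on held-out ratings only.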

Several directions remain for future work. The first is to apply other techniques for filling in the missing values before computing the SVD. In [8], the authors present an EM approach based on SVD. Another option is the incremental SVD proposed in [5], which computes the SVD of a matrix by iterating over the known values sequentially. This method has been applied to the Netflix problem and performed well.
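To make the idea of iterating over the known values concrete, here is a hedged sketch of a gradient-style factorization that visits only observed ratings. It is written in the spirit of the incremental approach and is not claimed to reproduce the algorithm of [5]; the names `incremental_factorize` and `predict`, the learning rate `lr`, the regularization weight `reg`, and the default values are all assumptions.

```python
import numpy as np

def incremental_factorize(ratings, n_users, n_items, rank=2,
                          lr=0.02, reg=0.02, epochs=200, seed=0):
    """Fit user and item factors from known ratings only.

    ratings: list of (user, item, value) triples; missing entries never appear.
    Returns the global mean and the two factor matrices.
    """
    rng = np.random.default_rng(seed)
    p = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    q = rng.normal(scale=0.1, size=(n_items, rank))  # item factors
    mu = float(np.mean([r for _, _, r in ratings]))  # global mean rating
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + p[u] @ q[i])
            pu = p[u].copy()  # keep the old user factors for the item update
            p[u] += lr * (err * q[i] - reg * p[u])
            q[i] += lr * (err * pu - reg * q[i])
    return mu, p, q

def predict(mu, p, q, u, i):
    """Predicted rating of user u for item i."""
    return float(mu + p[u] @ q[i])
```

Because no missing entry is ever touched, this kind of update sidesteps the imputation question entirely, at the cost of having to tune `lr`, `reg`, and `epochs`.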

The second direction is to apply local regression instead of local SVD on each cocluster. The data analysis step revealed several interesting features of the Netflix dataset. The ratings of the best and worst movies are quite stable. Good movies receive more votes than bad ones. Users with a high number of ratings seem to give us more information than those who rarely rate. All of these features, combined with the local average ratings of users and movies and the global averages, could be good variables for predicting the missing ratings.
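One possible shape for such a local regression is sketched below, purely as an illustration of the proposal. The feature set (an intercept, local user and movie means, and log rating counts), the helper names `build_features` and `fit_local_regression`, and the use of ordinary least squares are assumptions; global averages could be appended as further columns.

```python
import numpy as np

def build_features(ratings, n_users, n_items):
    """Build regression features for the known ratings of one cocluster.

    ratings: sequence of (user, item, value) rows with local integer indices.
    Returns the design matrix and the target ratings.
    """
    r = np.asarray(ratings, dtype=float)
    users = r[:, 0].astype(int)
    items = r[:, 1].astype(int)
    vals = r[:, 2]
    g = vals.mean()  # local (cocluster) mean, used as a fallback
    u_cnt = np.bincount(users, minlength=n_users)
    i_cnt = np.bincount(items, minlength=n_items)
    u_sum = np.bincount(users, weights=vals, minlength=n_users)
    i_sum = np.bincount(items, weights=vals, minlength=n_items)
    u_mean = np.where(u_cnt > 0, u_sum / np.maximum(u_cnt, 1), g)
    i_mean = np.where(i_cnt > 0, i_sum / np.maximum(i_cnt, 1), g)
    x = np.column_stack([
        np.ones(len(vals)),      # intercept
        u_mean[users],           # local average rating of the user
        i_mean[items],           # local average rating of the movie
        np.log1p(u_cnt[users]),  # how many ratings the user gave
        np.log1p(i_cnt[items]),  # how many votes the movie received
    ])
    return x, vals

def fit_local_regression(x, y):
    """Ordinary least squares weights for one cocluster."""
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w
```

Predictions for the training pairs are then `x @ w`; for a fair estimate the means and counts should be computed without the rating being predicted.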


Tuyen Huynh 2007-05-09