Data Mining: A Statistical Learning Perspective

CS 395T/CAM 395T

CS Unique No. 55905 / CAM Unique No. 65780

Spring 2008
MW 9:30-11am
PAR 303

Instructor: Inderjit Dhillon
Office: ACES 2.332
Office Hours: M 11am-noon and by appointment

TA: Prateek Jain
Office: TAY 137
Office Hours: Tue 3-4pm and Thurs 3-4pm

Course Description

This graduate course will focus on various mathematical and statistical aspects of data mining. Topics covered include supervised learning (regression, classification, support vector machines) and unsupervised learning (clustering, principal components analysis, dimensionality reduction). The technical tools used in the course draw from linear algebra, multivariate statistics and optimization.

Prerequisites: M341 or equivalent.


  • Course Information (contains grading information), handed out on the first class day.
  • Class Survey.

Textbook

  • "Pattern Recognition and Machine Learning" by C. Bishop, Springer, 2006.

Homeworks

  • Homework 1, Solutions.
  • Homework 2.
  • Homework 3, Solutions.

Class Presentations

  • Schedule
  • Suggestions for Paper Readings

Class Projects

  • Project Suggestions

Other References

  • "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by T. Hastie, R. Tibshirani, J. Friedman, Springer-Verlag, 2001.
  • "Pattern Classification" by R. Duda, P. Hart and D. Stork, John Wiley and Sons, 2000.