## Data Mining: A Mathematical Perspective

## CS 391D/CAM 395T

### CS Unique No. 54950 / CAM Unique No. 66117

Fall 2009

TTh 9:30-11am

WEL 2.312

Instructor: Prof. Inderjit Dhillon

Office: ACES 2.332

Office Hours: Tue 11am-noon and by appointment

TA: Wei Tang

Office: TAY 137

Office Hours: MW 3:30-5:30pm

### Course Description

Data mining is the automated discovery of interesting patterns and
relationships in massive data sets.
This graduate course will focus on various mathematical and statistical
aspects of data mining. Topics covered include supervised methods
(regression, classification) and unsupervised
methods (clustering, principal components analysis, dimensionality
reduction). The technical tools used in the course will draw from linear
algebra, multivariate statistics and optimization.
The main tools from these areas will be covered
in class, but undergraduate-level linear algebra is a prerequisite (see below).
A substantial portion of the course will focus on research
projects, for which students will choose a well-defined research problem.
Projects can vary in their theoretical/mathematical
content, and in the implementation/programming involved.
Projects will be conducted by teams of 2-3 students.

Prerequisites: Undergraduate-level linear algebra (M341 or equivalent) and some mathematical sophistication.

### Books

### Reading Material

### Homeworks

### Class Presentations

### Class Projects

### Grading