CS378 / SSC358, Spring 2013
BUR 134, Mon & Wed 3:30 - 5:00 pm
In recent years, rapid developments in data collection and storage technologies have led to data sets that are "big" in many senses of the word. Data mining is the automatic discovery of interesting patterns and relationships in such "big data". This undergraduate course will provide an introduction to the topic of data mining, and some statistical principles underlying its key methods. Topics covered will include data preprocessing, regression, classification, clustering, dimensionality reduction, and association analysis.
5+ Homeworks (25%)
(1 Midterm + 1 Final) Exams (40%)
Final Project (30%): Initial Project Milestone (7.5%) due TBD; Final Project Presentation (22.5%) due final day of class
Class Attendance and Participation (5%)
Introduction to Data Mining. P. Tan, M. Steinbach, V. Kumar, Addison Wesley, 2006.
Each student is expected to submit individually written homework. When using information from papers or other external sources, please cite it. Homeworks are due at the beginning of class on the due date, unless otherwise specified. There will be two "free" late days, which you may use either both on one homework or on two different homeworks. Otherwise, a homework will be worth 50% if it is one day late, and 0% if it is two or more days late. All homeworks must be submitted, even if more than two days late, to avoid an incomplete grade.
There will be a final course project. The list of candidate projects will be provided once the class gets underway.
Class Participation Policy
I expect students to participate actively in class, as well as on the Piazza discussion site.