UTCS Colloquium/AI: Greg Hamerly, Baylor University, "PG-means: learning the number of clusters in data," ACES 6.304, Friday, May 4, 2007 at 10:00 a.m.

Contact Name: Jenna Whitney
Date: May 4, 2007, 10:00am - 11:00am

There is a signup schedule for this event.

Type of Talk: UTCS Colloquium

Speaker/Affiliation: Greg Hamerly/Baylor University

Date/Time: Friday, May 4, 2007 at 10:00 a.m.

Location: ACES 6.304

Host: Inderjit Dhillon

Talk Title: PG-means: learning the number of clusters in data

Talk Abstract:
We present a novel algorithm called PG-means which is able to learn the
number of clusters in a classical Gaussian mixture model. Our method
is robust and efficient; it uses statistical hypothesis tests on one-dimensional
projections of the data and model to determine if the examples are well
represented by the model. In so doing, we are applying a statistical test for
the entire model at once, not just on a per-cluster basis. We show that our
method works well in difficult cases such as non-Gaussian data, overlapping
clusters, eccentric clusters, high dimension, and many true clusters. Further,
our new method provides a much more stable estimate of the number of
clusters than existing methods. This was joint work with Yu Feng, presented at
NIPS 06.
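
The core idea in the abstract, testing the whole mixture model against the data on one-dimensional projections, can be illustrated with a short sketch. This is not the speakers' implementation. It assumes the mixture parameters are already given (no EM fitting is shown), uses the Kolmogorov-Smirnov test as one possible choice of hypothesis test (the abstract does not name the test), and uses the asymptotic KS critical value as a simplification. All function names and the defaults `n_projections=12` and `alpha=0.001` are illustrative assumptions.

```python
import math
import random


def normal_cdf(x, mean, std):
    """CDF of a one-dimensional Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def project_model(weights, means, covs, p):
    """Project each Gaussian component onto the unit direction p.

    A Gaussian N(mu, Sigma) projects to N(p.mu, p^T Sigma p) in one dimension.
    Returns a list of (weight, projected mean, projected std) tuples.
    """
    projected = []
    for w, mu, cov in zip(weights, means, covs):
        var = dot(p, [dot(row, p) for row in cov])
        projected.append((w, dot(p, mu), math.sqrt(var)))
    return projected


def mixture_cdf(x, projected):
    """CDF of the projected one-dimensional Gaussian mixture."""
    return sum(w * normal_cdf(x, m, s) for w, m, s in projected)


def ks_statistic(samples, projected):
    """Largest gap between the empirical CDF and the projected model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = mixture_cdf(x, projected)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def random_unit_vector(dim, rng):
    """Random direction: normalize a vector of independent Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def model_fits(data, weights, means, covs, n_projections=12, alpha=0.001, seed=0):
    """Return True if no random projection rejects the model.

    Assumption: asymptotic KS critical value sqrt(-ln(alpha/2) / (2n)),
    applied to the whole mixture at once rather than per cluster.
    """
    rng = random.Random(seed)
    n = len(data)
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0) / n)
    for _ in range(n_projections):
        p = random_unit_vector(len(data[0]), rng)
        projected = project_model(weights, means, covs, p)
        samples = [dot(p, x) for x in data]
        if ks_statistic(samples, projected) > critical:
            return False  # this projection shows the model misrepresents the data
    return True
```

A rejection on any projection suggests the current model does not represent the examples well. A natural use, assumed here rather than stated in the abstract, would be to add a cluster, refit, and repeat the test until no projection rejects.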

Speaker Bio:
Greg Hamerly is an assistant professor of computer science at Baylor
University. His research is in machine learning, particularly in unsupervised
learning algorithms and their applications. He is a primary contributor
to the SimPoint project, which uses unsupervised learning for efficient
computer processor simulation.