UTCS AI Colloquia - Carla Brodley, "Class-Level Constraint-Based Clustering and its Application to Remote Sensing, AAAI 2014 Keywords and Multiple Sclerosis"

Contact Name: Karl Pichotta
Location: GDC 6.302
Date: Apr 3, 2014 3:00pm - 4:00pm

Signup Schedule: http://apps.cs.utexas.edu/talkschedules/cgi/list_events.cgi

Talk Audience: UTCS Faculty, Grads, Undergrads, Other Interested Parties

Host: Peter Stone

Talk Abstract: We present class-level constraint-based clustering, motivated by two new general applications: redefining class definitions via constraint-based clustering and removing confounding factors when clustering. Class definitions for supervised machine learning are often created for a particular end use, with limited regard as to whether the data supports these distinctions. This raises two potential issues for creating an accurate classifier. First, the features may not support the required class distinctions. Second, class definitions may change over time and thus need to be re-examined. We present and evaluate our proposed class-level constraint-based clustering algorithm in the context of two motivating domains: redefining the land-cover classification scheme used to create maps of the Earth's global land cover, and rediscovering the set of keywords for AAAI 2014. The second half of the talk proposes an approach to applying constraint-based clustering to remove confounding factors, which, if left in the data, can lead to undesirable clustering results. For example, in medical datasets, age is often a confounding factor in tests designed to judge the severity of a patient's disease through measures of mobility, eyesight, and hearing. In such cases, removing age from each instance will not remove its effect from the data, because other features will be correlated with age. Motivated by the need to find homogeneous groups of multiple sclerosis (MS) patients, we apply our approach to remove physician subjectivity from patient data. The result is a promising novel grouping of patients that can help uncover the factors that impact disease progression in MS.

Speaker Bio: Carla E. Brodley is a professor in the Department of Computer Science at Tufts University and holds a secondary appointment in the Clinical and Translational Science Institute, Tufts Medical Center. She received her PhD in computer science from the University of Massachusetts at Amherst in 1994. From 1994 to 2004, she was on the faculty of the School of Electrical Engineering at Purdue University, West Lafayette, Indiana. She joined the faculty at Tufts in 2004. Professor Brodley's research interests include machine learning, knowledge discovery in databases, health IT, and personalized medicine. She has worked in the areas of intrusion detection, anomaly detection, classifier formation, unsupervised learning, and applications of machine learning to remote sensing, computer security, neuroscience, digital libraries, astrophysics, content-based image retrieval of medical images, computational biology, chemistry, evidence-based medicine, and personalized medicine. She served as chair of the Computer Science Department at Tufts from 2010 to 2013. In 2001, she served as program co-chair for the International Conference on Machine Learning (ICML), and in 2004 she served as general chair for ICML. In 2004-2005, she was a member of the Defense Science Study Group. She was a member of the CRA board of directors from 2008 to 2012, served on the AAAI council from 2008 to 2011, and co-chaired CRA-W from 2008 to 2011. Currently, she is on the editorial boards of JMLR, Machine Learning, and DMKD; she is a board member of the International Machine Learning Society; she is co-chairing AAAI in 2014; and she is a member of ISAT.