AI Forum: Dr. Rich Caruana, Cornell University. Which Supervised Learning Method Works Best for What? An Empirical Comparison of Learning Methods and Metrics ++ ACES 2.402
There is a signup schedule for this event.
Speaker Name/Affiliation: Dr. Rich Caruana, Cornell University
Date/Time: Friday, February 9, 2007, 11:00 a.m. to noon
Location: ACES 2.402
Host: Dr. Ray Mooney
Talk Title: Which Supervised Learning Method Works Best for What? An Empirical Comparison of Learning Methods and Metrics ++
Talk Abstract:
Decision trees are intelligible, but do they perform well enough that you should use them? Have SVMs replaced neural nets, or are neural nets still best for regression and SVMs best for classification? Boosting maximizes margins similarly to SVMs, but can boosting compete with SVMs? And if it does compete, is it better to boost weak models, as theory might suggest, or to boost stronger models? Bagging is simpler than boosting -- how well does bagging stack up against boosting? Breiman said Random Forests are better than bagging and as good as boosting. Was he right? And what about old friends like logistic regression, KNN, and naive Bayes? Should they be relegated to the history books, or do they still fill important niches?
In this talk we compare the performance of these supervised learning methods on a number of performance criteria: Accuracy, F-score, Lift, Precision/Recall Break-Even Point, Area under the ROC, Average Precision, Squared Error, Cross-Entropy, and Probability Calibration. The results show that no one learning method does it all, but some methods can be repaired so that they do very well across all performance metrics. In particular, we show how to obtain the best probabilities from max-margin methods such as SVMs and boosting via Platt's Method and isotonic regression. We then describe a new ensemble method that combines select models from these ten learning methods to yield much better performance. Although these ensembles perform extremely well, they are too complex for many applications. We'll describe a model compression method we are developing to fix that. Finally, if time permits, we'll discuss how the performance metrics relate to each other and which of them you probably should (or shouldn't) use.
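As an illustration of the calibration technique the abstract mentions, the following sketch fits a max-margin classifier and converts its scores into probabilities with both Platt scaling (logistic/sigmoid fitting) and isotonic regression. It assumes scikit-learn, whose `CalibratedClassifierCV` implements both methods; the synthetic dataset and parameter choices are illustrative, not from the talk.

```python
# Sketch: calibrating an SVM's decision scores into probabilities
# via Platt scaling ("sigmoid") and isotonic regression (scikit-learn).
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Illustrative synthetic binary classification problem.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for method in ("sigmoid", "isotonic"):  # "sigmoid" is Platt's method
    # Cross-validated calibration: the SVM is fit on folds and a
    # calibration map is learned on the held-out portions.
    clf = CalibratedClassifierCV(LinearSVC(), method=method, cv=3)
    clf.fit(X_train, y_train)
    probs = clf.predict_proba(X_test)[:, 1]
    # Brier score: mean squared error of the predicted probabilities
    # (lower is better-calibrated).
    print(f"{method}: Brier score = {brier_score_loss(y_test, probs):.4f}")
```

In practice the choice between the two mirrors a point from the talk: Platt scaling fits a two-parameter sigmoid and works well with little calibration data, while isotonic regression is more flexible but needs more data to avoid overfitting.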
Speaker Bio:
Rich Caruana is a professor in the Department of Computer Sciences at Cornell University. He obtained his Ph.D. from Carnegie Mellon University in 1997. Most of his research is in machine learning, data mining, and medical decision making. In machine learning he works on inductive transfer (a.k.a. multitask learning), ensemble learning, model calibration (predicting good probabilities), feature selection, missing values, and artificial neural networks. In general he likes to work on real problems and develop new learning methods by abstracting what is required to make machine learning work well on those problems.