UTCS Colloquium: Alexandros Stamatakis, Ecole Polytechnique Federale de Lausanne/School of Comp & Communication Sciences: "Faster Algorithms for Support Value Computation & Emerging Parallel Architectures for Phylogeny Reconstruction"
There is a signup schedule for this event.
Speaker: Alexandros Stamatakis
Affiliation: Ecole Polytechnique Federale de Lausanne/School of Comp & Communication Sciences
Date/Time: 2:00 p.m. - 3:30 p.m.
Location: TAY 3.128 - East wall (chalkboard)
Host: Tandy Warnow
Talk Title: Faster Algorithms for Support Value Computation & Emerging Parallel Architectures for Phylogeny Reconstruction
Talk Abstract:
Despite the impressive progress that has been achieved with the new generation of Maximum Likelihood (ML) search algorithms, the computation of support values based on non-parametric bootstrapping (BS) still represents a major computational challenge.
Initially, I will discuss why the Resampling Estimated Log Likelihood (RELL) method is probably very hard to apply to large real-world datasets. Thereafter, I will present new heuristics to accelerate the BS procedure in RAxML (Randomized Axelerated Maximum Likelihood). In comparison to the standard BS procedure, these heuristics yield run time improvements ranging from a factor of 7 on datasets with 500 sequences to a factor of 14 on datasets with 1,700 sequences. At the same time, the support values obtained by the new BS heuristics show correlation coefficients ranging between 0.94 and 0.96 compared to those obtained via the standard method. In absolute numbers, this means that 100 bootstrap replicates on single-gene datasets of up to 2,000 taxa can be conducted in less than 24 hours on a single, reasonably fast processor.
In the second part of my talk, I will outline how the computation of large multi-gene datasets with ML can be efficiently parallelized on hardware platforms with very distinct architectures, such as the IBM Cell and the IBM BlueGene. The parallelization on BlueGene scales well up to 512 processors on the largest dataset analyzed under ML to date, which consists of 270 sequences and 500,000 base pairs.
I will conclude with an overview of current work on related projects.
Related papers (PDF) and software (open source code for Mac/Linux) are available at: icwww.epfl.ch/%7Estamata