GRACS Speaker Series-Kevin Liu/UT-Austin: "Rapid and Accurate Estimation of Large-scale Phylogenies and Sequence Alignments," TAY 3.128, Tuesday, May 4, 2010, 2:00 p.m.
Type of Talk: GRACS Speaker Series
Speaker/Affiliation: Kevin Liu/University of Texas at Austin
Date/Time: Tuesday, May 4, 2010, 2:00 p.m.
Location: TAY 3.128
Host: GRACS
Talk Title: Rapid and Accurate Estimation of Large-scale Phylogenies and Sequence Alignments
Talk Abstract:
Computational phylogenetics creates and evaluates algorithms that use present-day biological sequence data to estimate evolutionary history. This history is represented both as a phylogeny and a multiple sequence alignment. Scientists from biology, chemistry, and other fields use phylogenies and alignments to address many different problems, including the origin of life, epidemiology, proteomics, and biomedical research.
In the simplest case, a phylogeny is represented as a tree. The tree's leaves represent present-day groups of organisms, also known as taxa, and the tree's edges show how those taxa are related. A multiple sequence alignment shows relationships among the sequences themselves by inserting dashes "-", called indels, into sequences to line up sequence letters. A pair of lined-up letters represents homology, or shared evolutionary ancestry, of the two letters. Indels represent historical insertion or deletion of subsequences.
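As a small illustration of how lined-up letters encode homology, the sketch below walks two rows of a toy alignment and reports which positions of the original, unaligned sequences share a column. The sequences and the function name `homologous_pairs` are hypothetical and chosen only for this example; they do not come from the talk.

```python
def homologous_pairs(row_x, row_y):
    """Return (i, j) pairs: position i of sequence x is aligned to position j of sequence y.

    row_x and row_y are two rows of the same alignment, so they have equal
    length and use "-" for indels. Positions are 0-based indices into the
    unaligned sequences (the rows with dashes removed).
    """
    if len(row_x) != len(row_y):
        raise ValueError("alignment rows must have equal length")
    pairs = []
    i = j = 0  # current positions in the unaligned sequences
    for a, b in zip(row_x, row_y):
        if a != "-" and b != "-":
            pairs.append((i, j))  # two letters in one column: homologous
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


# Hypothetical toy rows: each contains one indel.
row_a = "ACG-TA"
row_b = "AC-TTA"
print(homologous_pairs(row_a, row_b))
```

Columns where either row has a dash contribute no pair, which reflects the reading of an indel as a historical insertion or deletion rather than shared ancestry.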
Most computational phylogenetic studies proceed by aligning sequences in the first phase, and then estimating a tree using the alignment in the second phase. The main advantage of these so-called two-phase methods is their speed. However, two-phase methods are also inaccurate under moderate to high rates of evolution.
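To make the second phase concrete, here is a minimal sketch that assumes the first phase has already produced an alignment and then estimates a tree from it. It uses mismatch fractions and UPGMA-style average-linkage clustering purely as a simple stand-in; the abstract does not say which alignment or tree estimation methods are used in practice, and the names `p_distance` and `upgma` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def p_distance(row_x, row_y):
    """Fraction of mismatching letters among columns where neither row has an indel."""
    compared = [(a, b) for a, b in zip(row_x, row_y) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    return sum(a != b for a, b in compared) / len(compared)


def upgma(alignment):
    """Build a tree (nested tuples of taxon names) from a dict of aligned rows.

    Repeatedly merges the two clusters with the smallest average
    pairwise distance between their member taxa.
    """
    # Map each cluster (a name or a nested tuple) to its member taxa.
    clusters = {name: [name] for name in alignment}
    dist = {
        frozenset(pair): p_distance(alignment[pair[0]], alignment[pair[1]])
        for pair in combinations(alignment, 2)
    }

    def average_distance(c1, c2):
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))


# Hypothetical alignment, taken as the fixed output of the first phase.
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "TTTT"}
print(upgma(toy_alignment))
```

Because the tree is computed from a fixed alignment, any error made in the first phase carries straight through to the second, which is one way to read the accuracy concern raised above.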
To address these and other shortcomings, a new generation of methods simultaneously estimates an alignment and tree from an input of unaligned sequences. These methods are either prohibitively slow or have not been shown to be more accurate than the best two-phase methods in practice.
Due to exponential growth in sequence data and computing power, biologists now perform phylogenetic studies that have grown by orders of magnitude in terms of number of taxa, sequence length, and number of markers, as compared with past decades. Moreover, these datasets span greater evolutionary timescales and involve more complex evolutionary events than ever before. As the ambitions of phylogenetic studies grow, the state of the art of computational phylogenetic algorithms must keep up *both* in terms of scalability and accuracy.
To this end, I present SATe, short for Simultaneous Alignment and Tree Estimation. SATe is the first algorithm that can accurately estimate phylogenies and alignments with thousands of taxa and thousands of aligned sites. This work is part of my larger goal of creating algorithms for large-scale, accurate estimation of evolutionary history under complex evolutionary models.
Speaker Bio:
Kevin Liu is a Ph.D. student in the Warnow lab in the Department of Computer Science at the University of Texas at Austin.