Fast Converging Phylogenetic Methods
Our research in this area combines
algorithm
design with simulation studies and probabilistic analyses of the performance of
phylogenetic
methods under various models of biomolecular sequence evolution. The performance
of
phylogenetic methods is evaluated according to two criteria: statistical
performance,
which addresses the accuracy of the method under a specified stochastic model of
evolution, and computational performance, which addresses the computational
requirements of the method. One of the more demanding statistical criterion is
the
convergence rate, which can be interpreted as the sequence length needed for a
method
to recover the true tree with probability 1-epsilon. Some of the recent
developments
in computational phylogenetics include the first analytical bounds on the
performance of
existing methods under various models of biomolecular sequence
evolution and the first methods guaranteed to recover the true tree
from sequences of polynomial length.
Such "fast converging" methods may provide
key techniques
for the reconstruction of large, evolutionary divergent phylogenetic trees.
      
This research will continue the effort to design new polynomial-time
algorithms with polynomial
convergence rates. This research will extend the basic theory of convergence
rates of
phylogenetic methods as well as develop new "fast converging" methods.
All methods developed
will be studied extensively using simulations of biomolecular sequence
evolution
as well as
real datasets.
      
The goal of the research is to develop algorithms that
have provable performance
guarantees,
that provide measurable performance improvements on both real and simulated
datasets, and
that outperform, with respect to both running time and accuracy, all currently
used methods,
so that the algorithms will come into use within the biology community. The
research will
be based on results that were recently obtained in the development of
"fast converging" methods, in the experimental assessment of phylogenetic
methods, and in the
implementation and performance engineering of algorithms for hard optimization
problems in
phylogeny.
People
-
Tandy Warnow,
Department of Computer Sciences, UT-Austin
-
Bernard Moret,
Department of Computer Sciences, University of New Mexico
Publications
-
"A few logs suffice to build (almost) all trees (Part 1)," by
P.L. Erdös, M.A. Steel, L.A. Székely, and T. Warnow.
Random Structures and Algorithms, 14(2), 1999:153-184.
(pdf)
-
"A few logs suffice to build almost all trees (Part II),"
by
P.L. Erdös, M. Steel, L. Székély, and T. Warnow.
Theoretical Computer
Science, 221, pp. 77-118, 1999 (by invitation, special issue
of selected papers from ICALP 1997.)
(Also appears as DIMACS Technical Report 97-72.)
(pdf)
(postscript)
-
"Disk-Covering, a fast converging method for phylogenetic
tree reconstruction", by
D. Huson, S. Nettles, and T. Warnow.
Special issue of the
Journal of Computational Biology for
selected papers from RECOMB 1999, Vol. 6, No. 3, 1999,
pp. 369-386.
(This appeared in a preliminary form in
the Proceedings of RECOMB 1999, as
Obtaining highly accurate topology estimates of
evolutionary trees from very short sequences.
Lyon, France.)
(pdf)
(postscript)
-
"Absolute Convergence:
True Trees From Short Sequences",
by
Tandy Warnow, Bernard Moret, and Katherine St. John. 2001.
Appeared in the ACM-SIAM Symposium on Discrete Algorithms (SODA) 2001.
(pdf)
(postscript)
-
"Designing Fast Converging Phylogenetic Methods", with Tandy Warnow,
Katherine St. John, Luay Nakhleh, Usman Roshan, and Jerry Sun
(presented at ISMB 2001 (Intelligent Systems for Molecular Biology, also
appear in special edition of Bioinformatics, Vol. 17 Suppl. 1 2001, pp.
S190-S198).
(pdf)
(postcript)
-
"The
Performance of Phylogenetic Methods on Trees of Bounded Diameter",
with Tandy Warnow, Katherine St. John, Luay Nakhleh, Usman Roshan and Jerry Sun
In Lecture Notes for
Computer Science No. 2149, pages 227-237: Proceedings of the First Workshop on Algorithms
in Bioinformatics (WABI 2001).
(postcript,
pdf)
-
"The Accuracy of Fast Phylogenetic Methods for Large Datasets", with
L. Nakhleh, B.M.E. Moret, U. Roshan, K. St. John, J. Sun, and T. Warnow.
In Proceedings of the Seventh Pacific Symposium on Biocomputing (PSB 2002), 7:211-222.
(ps)
Presentations
The following presentation is
on fast converging methods.
(PDF)
Fast converging phylogenetic methods
(Stochastic Processes and Applications, June 27, 2005).