My research combines mathematics, computer science, probability, and statistics to develop algorithms with improved accuracy for large-scale and complex estimation problems in phylogenomics and metagenomics. My major interests include multiple sequence alignment, phylogeny estimation (both gene trees and species trees), and metagenomic analysis, but I also work in historical linguistics. My current work aims to develop methods for ultra-large datasets (anywhere from 10,000 to 1,000,000 sequences), including datasets that are highly fragmentary or present other real-world challenges. We use real data and perform massive simulations to evaluate the performance of the methods we develop, and we collaborate closely with biologists and linguists on data analysis. I will be moving to the University of Illinois at Urbana-Champaign in Fall 2014, where I will be a Professor with a split position between Bioengineering and Computer Science and a courtesy appointment in Mathematics.


Our current collaborations include the 1KP (Thousand Transcriptome Project) and the Avian Phylogenomics Project. These collaborations involve both data analysis and the development of new methods for estimating alignments and trees (both gene trees and species trees). We welcome collaborations with biologists whose data are difficult to analyze, either because the datasets are too large for current methods or because current methods are not sufficiently accurate.


My current research is funded by the National Science Foundation (DEB-0733029 and DBI-1062335). I have also recently benefited from the support of the John Simon Guggenheim Memorial Foundation, and from early support from the David and Lucile Packard Foundation, the Radcliffe Institute for Advanced Study at Harvard University, and the Program for Evolutionary Dynamics at Harvard University.

Recent and Upcoming Symposia and Software Schools

"Plus de détails, plus de détails, disait-il à son fils, il n'y a d'originalité et de vérité que dans les détails..." ("More details, more details, he would say to his son; there is originality and truth only in the details...") -- Stendhal, Lucien Leuwen (a quote much loved by my stepfather, Martin J. Klein, and an essential guide for all scholarship).

Google Scholar Citations (i10-index 101 and h-index 46).

The CIPRES Project is finally over, but the CIPRES Portal is still available through the TeraGrid (see also the CIPRES Science Gateway story). Here is an abridged version of the CIPRES final report.

Siavash Mirarab, my PhD student, has been awarded a fellowship by the Howard Hughes Medical Institute.



For prospective students

Current and former students and postdocs


MenieModa, fashion blog

SATé and PASTA software

Lab Website

Downloadable papers

Complete vita and publication list

Brief vita



Computational Historical Linguistics