Datasets for testing ILS-based species tree estimation methods

This webpage provides datasets used in two papers: one by J. Yang and T. Warnow (Fast and accurate methods for phylogenomic analyses, BMC Bioinformatics 2011) and one by M. S. Bayzid and T. Warnow (Naive binning improves phylogenomic analyses, Bioinformatics, to appear). Please cite the relevant papers if you use these datasets.


Datasets used in "Fast and accurate methods for phylogenomic analyses"

J. Yang, T. Warnow


17-taxon ILS: 500 replicates, at 8 and 32 genes

file contents:

Please cite:
  • Yun Yu, Tandy Warnow, and Luay Nakhleh. Algorithms for MDC-based multi-locus phylogeny inference, Proc. RECOMB 2011.
  • Yun Yu, Tandy Warnow, and Luay Nakhleh. Algorithms for MDC-based multi-locus phylogeny inference: Beyond rooted binary gene trees on single alleles. J Comp Biol 2011, 18(11): 1543-1559.

    100-taxon ILS: 10 replicates, 25 genes

    file contents:

    Please cite Jimmy Yang and Tandy Warnow, Fast and accurate methods for phylogenomic analyses, BMC Bioinformatics 2011, 12(Suppl 9):S4.

    100-taxon nonILS: 6 model conditions, 10 replicates, 25 and 50 genes

    100L2 100L3 100S2 100L2-vbr1 100L3-vbr1 100S2-vbr1

    file contents (replace 100L2 with another model condition for the corresponding file):

    Please cite Jimmy Yang and Tandy Warnow, Fast and accurate methods for phylogenomic analyses, BMC Bioinformatics 2011, 12(Suppl 9):S4.

    500-taxon nonILS: 6 model conditions, 10 replicates, 25 and 50 genes

    500L5 500S3 500M3 500L5-vbr1 500S3-vbr1 500M3-vbr1

    file contents (replace 500L5 with another model condition for the corresponding file):

    Please cite Jimmy Yang and Tandy Warnow, Fast and accurate methods for phylogenomic analyses, BMC Bioinformatics 2011, 12(Suppl 9):S4.

Datasets used in "Naive binning improves phylogenomic analyses"

Md. Shamsuzzoha Bayzid and Tandy Warnow, Bioinformatics (to appear)

    In addition to the 17-taxon datasets listed above, this paper also studied the following data:

    11-taxon ILS: 100 replicates, 100 genes

    File contents:

    Please cite Y. Chung and C. Ané (2011). Comparing two Bayesian methods for gene tree/species tree reconstruction: a simulation with incomplete lineage sorting and horizontal gene transfer. Syst Biol 60(3): 261-275.

    17-taxon ILS (estimated gene trees): 100 replicates, at 8 and 32 genes

    File contents:


Please email tandy AT cs.utexas.edu if you have any questions.