Abstract

Motivation: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows.

Results: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.

Original language: English (US)
Article number: S7
Journal: BMC Genomics
Volume: 15
Issue number: 6
DOI: 10.1186/1471-2164-15-S6-S7
State: Published - Oct 17 2014

Keywords

  • Disk covering methods
  • Divide-and-conquer
  • Incomplete lineage sorting
  • MP-EST
  • Multi-species coalescent process

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Disk covering methods improve phylogenomic analyses. / Bayzid, Md Shamsuzzoha; Hunt, Tyler; Warnow, Tandy.

In: BMC Genomics, Vol. 15, No. 6, S7, 17.10.2014.

Research output: Contribution to journal › Article

@article{52e8ada2260b40548ce149942772de55,
title = "Disk covering methods improve phylogenomic analyses",
abstract = "Motivation: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows. Results: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.",
keywords = "Disk covering methods, Divide-and-conquer, Incomplete lineage sorting, MP-EST, Multi-species coalescent process",
author = "Bayzid, {Md Shamsuzzoha} and Hunt, Tyler and Warnow, Tandy",
year = "2014",
month = "10",
day = "17",
doi = "10.1186/1471-2164-15-S6-S7",
language = "English (US)",
volume = "15",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "6",

}

TY  - JOUR
T1  - Disk covering methods improve phylogenomic analyses
AU  - Bayzid, Md Shamsuzzoha
AU  - Hunt, Tyler
AU  - Warnow, Tandy
PY  - 2014/10/17
Y1  - 2014/10/17
N2  - Motivation: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows. Results: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.
KW  - Disk covering methods
KW  - Divide-and-conquer
KW  - Incomplete lineage sorting
KW  - MP-EST
KW  - Multi-species coalescent process
UR  - http://www.scopus.com/inward/record.url?scp=84971249381&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84971249381&partnerID=8YFLogxK
U2  - 10.1186/1471-2164-15-S6-S7
DO  - 10.1186/1471-2164-15-S6-S7
M3  - Article
C2  - 25572610
AN  - SCOPUS:84971249381
VL  - 15
JO  - BMC Genomics
JF  - BMC Genomics
SN  - 1471-2164
IS  - 6
M1  - S7
ER  -