Abstract

Incomplete lineage sorting (ILS), modelled by the multi-species coalescent, is a process that results in a gene tree being different from the species tree. Because ILS is expected to occur for at least some loci within genome-scale analyses, the evaluation of species tree estimation methods in the presence of ILS is of great interest. Performance on simulated and biological data has suggested that concatenation analyses can result in the wrong tree with high support under some conditions, and a recent theoretical result by Roch and Steel proved that concatenation using unpartitioned maximum likelihood analysis can be statistically inconsistent in the presence of ILS. In this study, we survey the major species tree estimation methods, including the newly proposed “statistical binning” methods, and discuss their theoretical properties. We also note that there are two interpretations of the term “statistical consistency”, and discuss the theoretical results proven under both interpretations.

Original language: English (US)
Journal: PLoS Currents
Volume: 7
Issue number: TREEOFLIFE
DOIs: 10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7
State: Published - May 22, 2015

Fingerprint

Steel
Genome
Genes
Surveys and Questionnaires

ASJC Scopus subject areas

  • Medicine (miscellaneous)

Cite this

Concatenation analyses in the presence of incomplete lineage sorting. / Warnow, Tandy.

In: PLoS Currents, Vol. 7, No. TREEOFLIFE, 22.05.2015.

Research output: Contribution to journal › Article

@article{4d8ce9a767234a19bc3f70ac3f2bd705,
title = "Concatenation analyses in the presence of incomplete lineage sorting",
abstract = "Incomplete lineage sorting (ILS), modelled by the multi-species coalescent, is a process that results in a gene tree being different from the species tree. Because ILS is expected to occur for at least some loci within genome-scale analyses, the evaluation of species tree estimation methods in the presence of ILS is of great interest. Performance on simulated and biological data have suggested that concatenation analyses can result in the wrong tree with high support under some conditions, and a recent theoretical result by Roch and Steel proved that concatenation using unpartitioned maximum likelihood analysis can be statistically inconsistent in the presence of ILS. In this study, we survey the major species tree estimation methods, including the newly proposed “statistical binning” methods, and discuss their theoretical properties. We also note that there are two interpretations of the term “statistical consistency”, and discuss the theoretical results proven under both interpretations.",
author = "Tandy Warnow",
year = "2015",
month = "5",
day = "22",
doi = "10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7",
language = "English (US)",
volume = "7",
journal = "PLoS Currents",
issn = "2157-3999",
publisher = "Public Library of Science",
number = "TREEOFLIFE",
}

TY - JOUR

T1 - Concatenation analyses in the presence of incomplete lineage sorting

AU - Warnow, Tandy

PY - 2015/5/22

Y1 - 2015/5/22

N2 - Incomplete lineage sorting (ILS), modelled by the multi-species coalescent, is a process that results in a gene tree being different from the species tree. Because ILS is expected to occur for at least some loci within genome-scale analyses, the evaluation of species tree estimation methods in the presence of ILS is of great interest. Performance on simulated and biological data have suggested that concatenation analyses can result in the wrong tree with high support under some conditions, and a recent theoretical result by Roch and Steel proved that concatenation using unpartitioned maximum likelihood analysis can be statistically inconsistent in the presence of ILS. In this study, we survey the major species tree estimation methods, including the newly proposed “statistical binning” methods, and discuss their theoretical properties. We also note that there are two interpretations of the term “statistical consistency”, and discuss the theoretical results proven under both interpretations.

AB - Incomplete lineage sorting (ILS), modelled by the multi-species coalescent, is a process that results in a gene tree being different from the species tree. Because ILS is expected to occur for at least some loci within genome-scale analyses, the evaluation of species tree estimation methods in the presence of ILS is of great interest. Performance on simulated and biological data have suggested that concatenation analyses can result in the wrong tree with high support under some conditions, and a recent theoretical result by Roch and Steel proved that concatenation using unpartitioned maximum likelihood analysis can be statistically inconsistent in the presence of ILS. In this study, we survey the major species tree estimation methods, including the newly proposed “statistical binning” methods, and discuss their theoretical properties. We also note that there are two interpretations of the term “statistical consistency”, and discuss the theoretical results proven under both interpretations.

UR - http://www.scopus.com/inward/record.url?scp=84958559378&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84958559378&partnerID=8YFLogxK

U2 - 10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7

DO - 10.1371/currents.tol.8d41ac0f13d1abedf4c4a59f5d17b1f7

M3 - Article

AN - SCOPUS:84958559378

VL - 7

JO - PLoS Currents

JF - PLoS Currents

SN - 2157-3999

IS - TREEOFLIFE

ER -