TY - JOUR
T1 - On the Robustness to Gene Tree Estimation Error (or lack thereof) of Coalescent-Based Species Tree Methods
AU - Roch, Sebastien
AU - Warnow, Tandy
N1 - Publisher Copyright:
© The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
PY - 2015/7/1
Y1 - 2015/7/1
AB - The estimation of species trees using multiple loci has become increasingly common. Because different loci can have different phylogenetic histories (reflected in different gene tree topologies) for multiple biological causes, new approaches to species tree estimation have been developed that take gene tree heterogeneity into account. Among these multiple causes, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is potentially the most common cause of gene tree heterogeneity, and much of the focus of the recent literature has been on how to estimate species trees in the presence of ILS. Despite progress in developing statistically consistent techniques for estimating species trees when gene trees can differ due to ILS, there is substantial controversy in the systematics community as to whether to use the new coalescent-based methods or the traditional concatenation methods. One of the key issues that has been raised is understanding the impact of gene tree estimation error on coalescent-based methods that operate by combining gene trees. Here we explore the mathematical guarantees of coalescent-based methods when analyzing estimated rather than true gene trees. Our results provide some insight into the differences between the promise of coalescent-based methods in theory and their performance in practice.
KW - coalescent-based methods
KW - gene tree estimation error
KW - incomplete lineage sorting
KW - multi-species coalescent
KW - species tree reconstruction
KW - statistical consistency
UR - http://www.scopus.com/inward/record.url?scp=84931054390&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84931054390&partnerID=8YFLogxK
U2 - 10.1093/sysbio/syv016
DO - 10.1093/sysbio/syv016
M3 - Article
C2 - 25813358
AN - SCOPUS:84931054390
SN - 1063-5157
VL - 64
SP - 663
EP - 676
JO - Systematic Biology
JF - Systematic Biology
IS - 4
ER -