TY - GEN
T1 - Statistically Consistent Rooting of Species Trees Under the Multispecies Coalescent Model
AU - Tabatabaee, Yasamin
AU - Roch, Sébastien
AU - Warnow, Tandy
N1 - Acknowledgments. SR was supported by NSF grants DMS-1902892, DMS-1916378 and DMS-2023239 (TRIPODS Phase II), as well as a Vilas Associates Award. TW was supported by the Grainger Foundation. SR thanks Cécile Ané and her group for helpful discussions. YT thanks Mohammed El-Kebir for helpful suggestions on an earlier version of this work. The authors thank the reviewers for their feedback.
PY - 2023
Y1 - 2023
N2 - Rooted species trees are used in several downstream applications of phylogenetics. Most species tree estimation methods produce unrooted trees and additional methods are then used to root these unrooted trees. Recently, Quintet Rooting (QR) (Tabatabaee et al., ISMB and Bioinformatics 2022), a polynomial-time method for rooting an unrooted species tree given unrooted gene trees under the multispecies coalescent, was introduced. QR, which is based on a proof of identifiability of rooted 5-taxon trees in the presence of incomplete lineage sorting, was shown to have good accuracy, improving over other methods for rooting species trees when incomplete lineage sorting was the only cause of gene tree discordance, except when gene tree estimation error was very high. However, the statistical consistency of QR was left as an open question. Here, we present QR-STAR, a polynomial-time variant of QR that has an additional step for determining the rooted shape of each quintet tree. We prove that QR-STAR is statistically consistent under the multispecies coalescent model, and our simulation study shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting.
KW - Multispecies Coalescent
KW - Rooting
KW - Species Tree Estimation
KW - Statistical Consistency
UR - http://www.scopus.com/inward/record.url?scp=85152546349&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85152546349&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-29119-7_3
DO - 10.1007/978-3-031-29119-7_3
M3 - Conference contribution
AN - SCOPUS:85152546349
SN - 9783031291180
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 41
EP - 57
BT - Research in Computational Molecular Biology - 27th Annual International Conference, RECOMB 2023, Proceedings
A2 - Tang, Haixu
PB - Springer
T2 - 27th International Conference on Research in Computational Molecular Biology, RECOMB 2023
Y2 - 16 April 2023 through 19 April 2023
ER -