TY - JOUR
T1 - Unblended disjoint tree merging using GTM improves species tree estimation
AU - Smirnov, Vladimir
AU - Warnow, Tandy
N1 - Publisher Copyright: © 2020 The Author(s).
PY - 2020/4/16
Y1 - 2020/4/16
AB - Background: Phylogeny estimation is an important part of much biological research, but large-scale tree estimation is infeasible using standard methods due to computational issues. Recently, an approach to large-scale phylogeny has been proposed that divides a set of species into disjoint subsets, computes trees on the subsets, and then merges the trees together using a computed matrix of pairwise distances between the species. The novel component of these approaches is the last step: Disjoint Tree Merger (DTM) methods. Results: We present GTM (Guide Tree Merger), a polynomial time DTM method that adds edges to connect the subset trees, so as to provably minimize the topological distance to a computed guide tree. Thus, GTM performs unblended mergers, unlike the previous DTM methods. Yet, despite the potential limitation, our study shows that GTM has excellent accuracy, generally matching or improving on two previous DTMs, and is much faster than both. Conclusions: The proposed GTM approach to the DTM problem is a useful new tool for large-scale phylogenomic analysis, and shows the surprising potential for unblended DTM methods.
KW - Divide-and-conquer pipelines
KW - Large-scale phylogeny estimation
KW - Species tree estimation
UR - http://www.scopus.com/inward/record.url?scp=85083479280&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083479280&partnerID=8YFLogxK
U2 - 10.1186/s12864-020-6605-1
DO - 10.1186/s12864-020-6605-1
M3 - Article
C2 - 32299343
AN - SCOPUS:85083479280
SN - 1471-2164
VL - 21
JO - BMC Genomics
JF - BMC Genomics
M1 - 235
ER -