TY - JOUR
T1 - SVDquest
T2 - Improving SVDquartets species tree estimation using exact optimization within a constrained search space
AU - Vachaspati, Pranjal
AU - Warnow, Tandy
N1 - Publisher Copyright: © 2018 Elsevier Inc.
PY - 2018/7
Y1 - 2018/7
AB - Species tree estimation from multi-locus datasets is complicated by processes such as incomplete lineage sorting (ILS) that result in different loci having different trees. Summary methods, which estimate species trees by combining gene trees, are popular but their accuracy is impaired by gene tree estimation error. Other approaches have been developed that only use the site patterns to estimate the species tree, and so are not impacted by gene tree estimation issues. In particular, PAUP∗ provides a method in which SVDquartets is used to compute a set Q of quartet trees (i.e., trees on four leaves), and then a heuristic search is used to combine the quartet trees into a species tree T, seeking to maximize the number of quartet trees in Q that agree with T. The PAUP∗ method based on SVDquartets (henceforth referred to as SVDquartets + PAUP∗) is increasingly used in phylogenomic studies due to its ability to reconstruct species trees without needing to estimate accurate gene trees. We present SVDquest∗, a new method for constructing species trees using site patterns that is guaranteed to produce species trees that satisfy at least as many quartet trees as SVDquartets + PAUP∗. We show that SVDquest∗ is competitive with ASTRAL and ASTRID (two leading summary methods) in terms of topological accuracy, and tends to be more accurate than ASTRAL and ASTRID under conditions with relatively high gene tree estimation error. SVDquest∗ is available in open source form at https://github.com/pranjalv123/SVDquest.
UR - http://www.scopus.com/inward/record.url?scp=85044125131&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85044125131&partnerID=8YFLogxK
U2 - 10.1016/j.ympev.2018.03.006
DO - 10.1016/j.ympev.2018.03.006
M3 - Article
C2 - 29530498
AN - SCOPUS:85044125131
SN - 1055-7903
VL - 124
SP - 122
EP - 136
JO - Molecular Phylogenetics and Evolution
JF - Molecular Phylogenetics and Evolution
ER -