Abstract

Species tree estimation from multi-locus datasets is complicated by processes such as incomplete lineage sorting (ILS) that result in different loci having different trees. Summary methods, which estimate species trees by combining gene trees, are popular, but their accuracy is impaired by gene tree estimation error. Other approaches have been developed that use only the site patterns to estimate the species tree, and so are not affected by gene tree estimation error. In particular, PAUP* provides a method in which SVDquartets is used to compute a set Q of quartet trees (i.e., trees on four leaves), and then a heuristic search is used to combine the quartet trees into a species tree T, seeking to maximize the number of quartet trees in Q that agree with T. The PAUP* method based on SVDquartets (henceforth referred to as SVDquartets + PAUP*) is increasingly used in phylogenomic studies due to its ability to reconstruct species trees without needing to estimate accurate gene trees. We present SVDquest, a new method for constructing species trees using site patterns that is guaranteed to produce species trees that satisfy at least as many quartet trees as SVDquartets + PAUP*. We show that SVDquest is competitive with ASTRAL and ASTRID (two leading summary methods) in terms of topological accuracy, and tends to be more accurate than ASTRAL and ASTRID under conditions with relatively high gene tree estimation error. SVDquest is available in open source form at https://github.com/pranjalv123/SVDquest.
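
The optimization criterion named in the abstract, the number of quartet trees in Q that agree with a candidate species tree T, can be illustrated with a minimal Python sketch. This is not the SVDquest or PAUP* implementation; the nested-tuple tree encoding, the quartet encoding ((a, b), (c, d)) for the split ab|cd, and the function names `clades`, `agrees`, and `quartet_score` are assumptions made here for illustration only.

```python
def clades(tree):
    """Return the leaf set below every node of a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])  # a leaf is its own clade
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def agrees(tree_clades, quartet):
    """True if some edge of the tree separates {a, b} from {c, d}."""
    (a, b), (c, d) = quartet
    left, right = {a, b}, {c, d}
    for clade in tree_clades:
        # Each clade and its complement form the bipartition of one edge.
        if left <= clade and not (right & clade):
            return True
        if right <= clade and not (left & clade):
            return True
    return False


def quartet_score(tree, quartets):
    """Count the quartet trees in `quartets` that agree with `tree`."""
    tree_clades = clades(tree)
    return sum(agrees(tree_clades, q) for q in quartets)


# Hypothetical example: five species and four quartet trees.
Q = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("D", "E")),
    (("A", "D"), ("B", "C")),
    (("A", "B"), ("D", "E")),
]
T1 = ((("A", "B"), "C"), ("D", "E"))
T2 = ((("A", "D"), "B"), ("C", "E"))
print(quartet_score(T1, Q), quartet_score(T2, Q))
```

Under this criterion a search prefers T1 to T2 because T1 agrees with more of the quartet trees in Q. The sketch only scores a given tree; how the candidate trees are proposed (heuristically or by exact optimization within a constrained search space) is a separate matter that it does not address.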

Original language: English (US)
Pages (from-to): 122-136
Number of pages: 15
Journal: Molecular Phylogenetics and Evolution
Volume: 124
State: Published - Jul 2018

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

Title: SVDquest: Improving SVDquartets species tree estimation using exact optimization within a constrained search space