TY - JOUR
T1 - An experimental study of Quartets MaxCut and other supertree methods
AU - Swenson, M. Shel
AU - Suri, Rahul
AU - Linder, C. Randal
AU - Warnow, Tandy
N1 - Funding Information:
This research was supported in part by the US National Science Foundation under grants DEB 0733029, 0331453 (CIPRES), and DGE 0114387. We thank Francois Barbancon for assistance early on in the project, Sagi Snir for assistance with using the QMC code and for providing additional software for generating quartet encodings, and the referees for their helpful and detailed comments.
PY - 2011/4/19
Y1 - 2011/4/19
N2 - Background: Supertree methods represent one of the major ways by which the Tree of Life can be estimated, but despite many recent algorithmic innovations, matrix representation with parsimony (MRP) remains the main algorithmic supertree method. Results: We evaluated the performance of several supertree methods based upon the Quartets MaxCut (QMC) method of Snir and Rao and showed that two of these methods usually outperform MRP and five other supertree methods that we studied, under many realistic model conditions. However, the QMC-based methods have scalability issues that may limit their utility on large datasets. We also observed that taxon sampling impacted supertree accuracy, with poor results obtained when all of the source trees were only sparsely sampled. Finally, we showed that the popular optimality criterion of minimizing the total topological distance of the supertree to the source trees is only weakly correlated with supertree topological accuracy. Therefore evaluating supertree methods on biological datasets is problematic. Conclusions: Our results show that supertree methods that improve upon MRP are possible, and that an effort should be made to produce scalable and robust implementations of the most accurate supertree methods. Also, because topological accuracy depends upon taxon sampling strategies, attempts to construct very large phylogenetic trees using supertree methods should consider the selection of source tree datasets, as well as supertree methods. Finally, since supertree topological error is only weakly correlated with the supertree's topological distance to its source trees, development and testing of supertree methods presents methodological challenges.
UR - http://www.scopus.com/inward/record.url?scp=79955102024&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955102024&partnerID=8YFLogxK
U2 - 10.1186/1748-7188-6-7
DO - 10.1186/1748-7188-6-7
M3 - Article
C2 - 21504600
AN - SCOPUS:79955102024
SN - 1748-7188
VL - 6
JO - Algorithms for Molecular Biology
JF - Algorithms for Molecular Biology
IS - 1
M1 - 7
ER -