A simulation study comparing supertree and combined analysis methods using SMIDGen

M. Shel Swenson, François Barbançon, C. Randal Linder, Tandy Warnow

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Supertree methods comprise one approach to reconstructing large molecular phylogenies given estimated source trees for overlapping subsets of the entire set of taxa. These source trees are combined into a single supertree on the full set of taxa using various algorithmic techniques, with the most common being matrix representation with parsimony (MRP). When the data allow, the competing approach is a combined analysis (also known as a "supermatrix" or "total evidence" approach) whereby the different sequence data matrices for each of the different subsets of taxa are concatenated into a single supermatrix, and a tree is estimated on that supermatrix. In this paper, we report an extensive simulation study comparing the supertree methods MRP and weighted MRP against combined analysis methods on large model trees, using a novel simulation methodology (Super-Method Input Data Generator, or SMIDGen), which better reflects biological processes and the practices of systematists. This study shows that combined analysis based upon maximum likelihood outperforms all the other methods, giving especially big improvements when the largest subtree does not contain most of the taxa.
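The two kinds of input contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline or SMIDGen itself: it assumes rooted source trees written as nested Python tuples, uses the standard Baum-Ragan style MRP coding (one column per internal clade, `1` inside the clade, `0` elsewhere in that tree, `?` for taxa absent from the tree), and pads missing genes with `?` in the supermatrix. The function names `clades`, `mrp_matrix`, and `concatenate_supermatrix` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical tree representation: a leaf is a taxon name, an internal node is a tuple.
Tree = Union[str, tuple]


def clades(tree: Tree) -> Tuple[frozenset, List[frozenset]]:
    """Return the leaf set of `tree` and the leaf sets of its non-root internal nodes."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves: frozenset = frozenset()
    found: List[frozenset] = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
        if not isinstance(child, str):
            found.append(child_leaves)  # the child is an internal node: record its clade
    return leaves, found


def mrp_matrix(source_trees: List[Tree]) -> Dict[str, str]:
    """Encode source trees as an MRP character matrix (one row per taxon)."""
    parsed = [clades(t) for t in source_trees]
    all_taxa = sorted(set().union(*(leaves for leaves, _ in parsed)))
    rows = {t: "" for t in all_taxa}
    for leaves, internal in parsed:
        for clade in internal:
            for t in all_taxa:
                if t in clade:
                    rows[t] += "1"
                elif t in leaves:
                    rows[t] += "0"
                else:
                    rows[t] += "?"  # taxon not sampled in this source tree
    return rows


def concatenate_supermatrix(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Concatenate per-gene alignments, padding unsampled taxa with '?'."""
    all_taxa = sorted(set().union(*(a.keys() for a in alignments)))
    rows = {t: "" for t in all_taxa}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = widths.pop()
        for t in all_taxa:
            rows[t] += aln.get(t, "?" * width)
    return rows


if __name__ == "__main__":
    trees = [(("A", "B"), "C"), (("B", "C"), "D")]
    print(mrp_matrix(trees))
    genes = [{"A": "ACGT", "B": "ACGA"}, {"B": "TT", "C": "TG"}]
    print(concatenate_supermatrix(genes))
```

In a supertree analysis the MRP matrix would then be handed to a parsimony search, whereas in a combined analysis the supermatrix would go to a tree estimation method such as maximum likelihood; neither search is shown here.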

Original language: English (US)
Title of host publication: Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings
Pages: 333-344
Number of pages: 12
DOI: https://doi.org/10.1007/978-3-642-04241-6_28
State: Published - Nov 2 2009
Externally published: Yes
Event: 9th International Workshop on Algorithms in Bioinformatics, WABI 2009 - Philadelphia, PA, United States
Duration: Sep 12 2009 to Sep 13 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5724 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Workshop on Algorithms in Bioinformatics, WABI 2009
Country: United States
City: Philadelphia, PA
Period: 9/12/09 to 9/13/09

Fingerprint

Parsimony
Simulation Study
Matrix Representation
Subset
Maximum likelihood
Phylogeny
Overlapping
Entire
Generator
Methodology
Simulation
Model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Swenson, M. S., Barbançon, F., Linder, C. R., & Warnow, T. (2009). A simulation study comparing supertree and combined analysis methods using SMIDGen. In Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings (pp. 333-344). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5724 LNBI). https://doi.org/10.1007/978-3-642-04241-6_28

A simulation study comparing supertree and combined analysis methods using SMIDGen. / Swenson, M. Shel; Barbançon, François; Linder, C. Randal; Warnow, Tandy.

Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings. 2009. p. 333-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5724 LNBI).

Swenson, MS, Barbançon, F, Linder, CR & Warnow, T 2009, A simulation study comparing supertree and combined analysis methods using SMIDGen. in Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5724 LNBI, pp. 333-344, 9th International Workshop on Algorithms in Bioinformatics, WABI 2009, Philadelphia, PA, United States, 9/12/09. https://doi.org/10.1007/978-3-642-04241-6_28
Swenson MS, Barbançon F, Linder CR, Warnow T. A simulation study comparing supertree and combined analysis methods using SMIDGen. In Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings. 2009. p. 333-344. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-04241-6_28
Swenson, M. Shel ; Barbançon, François ; Linder, C. Randal ; Warnow, Tandy. / A simulation study comparing supertree and combined analysis methods using SMIDGen. Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings. 2009. pp. 333-344 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{fc88b390805747fab7b07a49376e4b8a,
title = "A simulation study comparing supertree and combined analysis methods using SMIDGen",
abstract = "Supertree methods comprise one approach to reconstructing large molecular phylogenies given estimated source trees for overlapping subsets of the entire set of taxa. These source trees are combined into a single supertree on the full set of taxa using various algorithmic techniques, with the most common being matrix representation with parsimony (MRP). When the data allow, the competing approach is a combined analysis (also known as a {"}supermatrix{"} or {"}total evidence{"} approach) whereby the different sequence data matrices for each of the different subsets of taxa are concatenated into a single supermatrix, and a tree is estimated on that supermatrix. In this paper, we report an extensive simulation study comparing the supertree methods MRP and weighted MRP against combined analysis methods on large model trees, using a novel simulation methodology (Super-Method Input Data Generator, or SMIDGen), which better reflects biological processes and the practices of systematists. This study shows that combined analysis based upon maximum likelihood outperforms all the other methods, giving especially big improvements when the largest subtree does not contain most of the taxa.",
author = "Swenson, {M. Shel} and Fran{\c c}ois Barban{\c c}on and Linder, {C. Randal} and Tandy Warnow",
year = "2009",
month = "11",
day = "2",
doi = "10.1007/978-3-642-04241-6_28",
language = "English (US)",
isbn = "3642042406",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "333--344",
booktitle = "Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings",

}

TY - GEN

T1 - A simulation study comparing supertree and combined analysis methods using SMIDGen

AU - Swenson, M. Shel

AU - Barbançon, François

AU - Linder, C. Randal

AU - Warnow, Tandy

PY - 2009/11/2

Y1 - 2009/11/2

AB - Supertree methods comprise one approach to reconstructing large molecular phylogenies given estimated source trees for overlapping subsets of the entire set of taxa. These source trees are combined into a single supertree on the full set of taxa using various algorithmic techniques, with the most common being matrix representation with parsimony (MRP). When the data allow, the competing approach is a combined analysis (also known as a "supermatrix" or "total evidence" approach) whereby the different sequence data matrices for each of the different subsets of taxa are concatenated into a single supermatrix, and a tree is estimated on that supermatrix. In this paper, we report an extensive simulation study comparing the supertree methods MRP and weighted MRP against combined analysis methods on large model trees, using a novel simulation methodology (Super-Method Input Data Generator, or SMIDGen), which better reflects biological processes and the practices of systematists. This study shows that combined analysis based upon maximum likelihood outperforms all the other methods, giving especially big improvements when the largest subtree does not contain most of the taxa.

UR - http://www.scopus.com/inward/record.url?scp=70350349003&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350349003&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-04241-6_28

DO - 10.1007/978-3-642-04241-6_28

M3 - Conference contribution

AN - SCOPUS:70350349003

SN - 3642042406

SN - 9783642042409

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 333

EP - 344

BT - Algorithms in Bioinformatics - 9th International Workshop, WABI 2009, Proceedings

ER -