Parsimony is hard to beat!

Kenneth Rice, Tandy Warnow

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

The estimation of evolutionary history from biomolecular sequences is a major intellectual project in systematic biology, and many methods are used to reconstruct phylogenetic (i.e., evolutionary) trees from sequence data. In this paper, we report on an extensive performance analysis of parsimony and two distance-based methods, a popular method called neighbor joining and a new method developed by Agarwala et al. which approximates the L∞-nearest tree, on more than 260,000 sequence data sets simulated on approximately 500 model trees. Our experiments indicate a decrease in statistical power of the two distance methods as the diameter grows, but also show that parsimony is not as badly affected by the diameter as the distance methods. More generally, the experiments indicate that parsimony is almost always more accurate than the other two methods on reasonable-length sequences even under adverse conditions, such as having sites that evolve quickly within the tree, pairs of taxa with large evolutionary distances between them, or large ratios between the highest and the lowest substitution rates on the edges.
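As a rough illustration of what a parsimony score measures, the sketch below uses Fitch's algorithm to count the minimum number of substitutions needed to explain an alignment on a fixed rooted binary tree. The abstract does not say how parsimony was computed in the study, so the nested-tuple tree encoding, the function names, and the toy alignment are all assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate ancestral states, minimum changes) for a single site."""
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the per-site Fitch costs over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Toy data (not from the paper): four taxa, four aligned sites.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))

print(parsimony_score(tree_ab_cd, alignment))  # 3
print(parsimony_score(tree_ac_bd, alignment))  # 4
```

On this toy input the tree grouping A with B needs fewer substitutions than the tree grouping A with C, so a parsimony search would prefer it. Note that this only scores a given tree; searching over tree topologies for the lowest score is the expensive part and is not shown.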

Original language: English (US)
Title of host publication: Computing and Combinatorics - 3rd Annual International Conference COCOON 1997, Proceedings
Editors: Tao Jiang, D.T. Lee
Publisher: Springer-Verlag
Pages: 124-133
Number of pages: 10
ISBN (Print): 354063357X, 9783540633570
State: Published - Jan 1 1997
Externally published: Yes
Event: 3rd Annual International Computing and Combinatorics Conference, COCOON 1997 - Shanghai, China
Duration: Aug 20 1997 to Aug 22 1997

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1276
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Rice, K., & Warnow, T. (1997). Parsimony is hard to beat! In T. Jiang, & D. T. Lee (Eds.), Computing and Combinatorics - 3rd Annual International Conference COCOON 1997, Proceedings (pp. 124-133). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1276). Springer-Verlag.

@inproceedings{81c834793821443697b6b217ccd7aa26,
title = "Parsimony is hard to beat!",
author = "Kenneth Rice and Tandy Warnow",
year = "1997",
month = "1",
day = "1",
language = "English (US)",
isbn = "354063357X",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "124--133",
editor = "Tao Jiang and D.T. Lee",
booktitle = "Computing and Combinatorics - 3rd Annual International Conference COCOON 1997, Proceedings",

}

TY - GEN

T1 - Parsimony is hard to beat!

AU - Rice, Kenneth

AU - Warnow, Tandy

PY - 1997/1/1

Y1 - 1997/1/1

UR - http://www.scopus.com/inward/record.url?scp=84947793322&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84947793322&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84947793322

SN - 354063357X

SN - 9783540633570

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 124

EP - 133

BT - Computing and Combinatorics - 3rd Annual International Conference COCOON 1997, Proceedings

A2 - Jiang, Tao

A2 - Lee, D.T.

PB - Springer-Verlag

ER -