Distance-based genome rearrangement phylogeny

Li San Wang, Tandy Warnow, Bernard M.E. Moret, Robert K. Jansen, Linda A. Raubeson

Research output: Contribution to journal (Article)

Abstract

Evolution operates on whole genomes through direct rearrangements of genes, such as inversions, transpositions, and inverted transpositions, as well as through operations, such as duplications, losses, and transfers, that also affect the gene content of the genomes. Because these events are rare relative to nucleotide substitutions, gene order data offer the possibility of resolving ancient branches in the tree of life; the combination of gene order data with sequence data also has the potential to provide more robust phylogenetic reconstructions, since each can elucidate evolution at different time scales. Distance corrections greatly improve the accuracy of phylogeny reconstructions from DNA sequences, enabling distance-based methods to approach the accuracy of the more elaborate methods based on parsimony or likelihood at a fraction of the computational cost. This paper focuses on developing distance correction methods for phylogeny reconstruction from whole genomes. The main question we investigate is how to estimate evolutionary histories from whole genomes with equal gene content, and we present a technique, the empirically derived estimator (EDE), that we have developed for this purpose. We study the use of EDE on whole genomes with identical gene content, and we explore the accuracy of phylogenies inferred using EDE with the neighbor joining and minimum evolution methods under a wide range of model conditions. Our study shows that tree reconstruction under these two methods is much more accurate when based on EDE distances than when based on other distances previously suggested for whole genomes.
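To make the notion of a gene-order distance concrete, the following is a minimal sketch, not the EDE estimator itself. It represents a genome as a linear signed permutation, applies random inversions, and computes the breakpoint distance (one of the keywords of this record). The function names, the linear-genome framing with end markers, and the uniform choice of inversion endpoints are all illustrative assumptions rather than details taken from the paper.

```python
import random


def apply_inversion(genome, i, j):
    """Return a copy of `genome` with genome[i..j] reversed and sign-flipped."""
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def breakpoint_distance(a, b):
    """Count adjacencies of linear signed genome `a` that are absent from `b`.

    An adjacency (x, y) is treated as identical to (-y, -x), since reading
    the same pair of neighbouring genes on the other strand gives that form.
    """
    def framed(g):
        # Hypothetical end markers 0 and n + 1 pin down the genome ends.
        return [0] + list(g) + [len(g) + 1]

    adj_b = set()
    fb = framed(b)
    for x, y in zip(fb, fb[1:]):
        adj_b.add((x, y))
        adj_b.add((-y, -x))

    fa = framed(a)
    return sum((x, y) not in adj_b for x, y in zip(fa, fa[1:]))


def simulate(n_genes, n_inversions, seed=0):
    """Apply random inversions to the identity genome; return the breakpoint distance."""
    rng = random.Random(seed)
    identity = list(range(1, n_genes + 1))
    genome = identity
    for _ in range(n_inversions):
        i = rng.randrange(n_genes)
        j = rng.randrange(i, n_genes)  # assumed: endpoints chosen uniformly
        genome = apply_inversion(genome, i, j)
    return breakpoint_distance(genome, identity)


if __name__ == "__main__":
    for k in (1, 5, 20, 80, 320):
        print(k, simulate(n_genes=50, n_inversions=k))
```

Each inversion changes at most two adjacencies, and a linear genome with n genes has only n + 1 adjacencies, so the breakpoint distance can never exceed either 2k or n + 1. Once many inversions have accumulated, the observed distance therefore stops growing with the true number of events. That underestimation is the kind of effect a distance correction for whole genomes is meant to counter.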

Original language: English (US)
Pages (from-to): 473-483
Number of pages: 11
Journal: Journal of Molecular Evolution
Volume: 63
Issue number: 4
DOIs: https://doi.org/10.1007/s00239-005-0216-y
State: Published - Oct 1 2006
Externally published: Yes

Fingerprint

Phylogeny
genome
Genes
Gene Order
transposition (genetics)
methodology
Gene Rearrangement
substitution
Nucleotides
method
timescale
phylogenetics
nucleotide sequences
DNA
Costs and Cost Analysis

Keywords

  • Breakpoint
  • Distance-based methods
  • Fast ME
  • Genome rearrangements
  • Inversion
  • Nadeau-Taylor model
  • Neighbor joining

ASJC Scopus subject areas

  • Genetics
  • Biochemistry
  • Biochemistry, Genetics and Molecular Biology(all)
  • Genetics(clinical)
  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Agricultural and Biological Sciences(all)
  • Agricultural and Biological Sciences (miscellaneous)

Cite this

Wang, L. S., Warnow, T., Moret, B. M. E., Jansen, R. K., & Raubeson, L. A. (2006). Distance-based genome rearrangement phylogeny. Journal of Molecular Evolution, 63(4), 473-483. https://doi.org/10.1007/s00239-005-0216-y

@article{26902534810341ff925c0bf35bbd254e,
title = "Distance-based genome rearrangement phylogeny",
keywords = "Breakpoint, Distance-based methods, Fast ME, Genome rearrangements, Inversion, Nadeau-Taylor model, Neighbor joining",
author = "Wang, {Li San} and Tandy Warnow and Moret, {Bernard M.E.} and Jansen, {Robert K.} and Raubeson, {Linda A.}",
year = "2006",
month = "10",
day = "1",
doi = "10.1007/s00239-005-0216-y",
language = "English (US)",
volume = "63",
pages = "473--483",
journal = "Journal of Molecular Evolution",
issn = "0022-2844",
publisher = "Springer New York",
number = "4",

}

TY - JOUR
T1 - Distance-based genome rearrangement phylogeny
AU - Wang, Li San
AU - Warnow, Tandy
AU - Moret, Bernard M.E.
AU - Jansen, Robert K.
AU - Raubeson, Linda A.
PY - 2006/10/1
Y1 - 2006/10/1
KW - Breakpoint
KW - Distance-based methods
KW - Fast ME
KW - Genome rearrangements
KW - Inversion
KW - Nadeau-Taylor model
KW - Neighbor joining
UR - http://www.scopus.com/inward/record.url?scp=33750279466&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750279466&partnerID=8YFLogxK
U2 - 10.1007/s00239-005-0216-y
DO - 10.1007/s00239-005-0216-y
M3 - Article
C2 - 17021931
AN - SCOPUS:33750279466
VL - 63
SP - 473
EP - 483
JO - Journal of Molecular Evolution
JF - Journal of Molecular Evolution
SN - 0022-2844
IS - 4
ER -