A computer simulation analysis of the accuracy of partial genome sequencing and restriction fragment analysis in the reconstruction of phylogenetic relationships

Baozhen Qiao, Tony L. Goldberg, Gary J. Olsen, Ronald M. Weigel

Research output: Contribution to journal › Article

Abstract

Partial genome sequencing (PGS) and restriction fragment analysis (RFA) are used frequently in molecular epidemiologic investigations. The relative accuracy of PGS and RFA in phylogenetic reconstruction has not been assessed. In this study, 32 model phylogenetic trees with 16 extant lineages were generated, for which DNA sequences were simulated under varying conditions of genome length, nucleotide substitution rate, and between-site substitution rate variation. Genotyping using PGS and RFA was simulated. The effect of tree structure (stemminess, imbalance, lineage variation) on the accuracy of phylogenetic reconstruction (topological and branch length similarity) was evaluated. Overall, PGS was more accurate than RFA. The accuracy of PGS increased with increasing sequence length. The accuracy of RFA increased with the number of restriction enzymes used. In fragment size comparison, the Dice and Nei-Li algorithms differed little, with both more accurate than the Fragment Size Distribution algorithm. For RFA, higher tree stemminess and longer genome length were associated with higher topological accuracy, whereas lower tree stemminess and lower substitution rates were associated with higher branch length accuracy. For PGS, lower tree imbalance was associated with higher topological accuracy, whereas lower tree stemminess, higher substitution rate, and lower between-site substitution rate variation were associated with higher branch length accuracy. RFA had higher topological accuracy than PGS only for the shortest sequence length (200 bps) at a low substitution rate, high tree stemminess, and long genome length. PGS had equal or higher accuracy in branch length reconstruction than RFA under all conditions investigated. Thus, partial genome sequencing is recommended over restriction fragment analysis for conditions within the parameter space examined.
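As a rough illustration of the fragment comparison step mentioned in the abstract, the following Python sketch digests two short sequences in silico and scores their fragment-size profiles with the Dice coefficient in its commonly used form, 2 × shared / (total in A + total in B). It is not the authors' simulation code: the function names `digest` and `dice`, the toy sequences, the choice to cut at the start of each recognition site, and the exact-length matching of fragments are all assumptions made for the example.

```python
def digest(seq: str, site: str) -> list[int]:
    """Cut a linear sequence at the start of every occurrence of `site`
    and return the fragment lengths in order (a simplifying assumption:
    real enzymes cut at a fixed offset within the site)."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def dice(frags_a: list[int], frags_b: list[int]) -> float:
    """Dice similarity between two sets of fragment sizes."""
    a, b = set(frags_a), set(frags_b)
    if not a and not b:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical toy lineages: seq_b carries one substitution that
# destroys the second GAATTC site present in seq_a.
seq_a = "AAGAATTCCCCGAATTCTT"
seq_b = "AAGAATTCCCCGAGTTCTT"

frags_a = digest(seq_a, "GAATTC")  # [2, 9, 8]
frags_b = digest(seq_b, "GAATTC")  # [2, 17]
print(dice(frags_a, frags_b))      # 0.4
```

A single substitution removes one cut site and changes two of the three fragment sizes, which shows why fragment profiles carry less direct information about sequence divergence than the sequence itself.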

Original language: English (US)
Pages (from-to): 323-330
Number of pages: 8
Journal: Infection, Genetics and Evolution
Volume: 6
Issue number: 4
DOI: 10.1016/j.meegid.2005.10.002
State: Published - Jul 1 2006

Fingerprint

computer simulation
genome
phylogenetics
phylogeny
substitution
analysis
genotyping
rate
nucleotides
enzyme
nucleotide sequences
DNA

Keywords

  • Computer simulation
  • Disease transmission
  • Molecular epidemiology
  • Partial genome sequencing
  • Phylogenetic reconstruction
  • Restriction fragment analysis

ASJC Scopus subject areas

  • Microbiology
  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
  • Microbiology (medical)
  • Infectious Diseases

Cite this

A computer simulation analysis of the accuracy of partial genome sequencing and restriction fragment analysis in the reconstruction of phylogenetic relationships. / Qiao, Baozhen; Goldberg, Tony L.; Olsen, Gary J.; Weigel, Ronald M.

In: Infection, Genetics and Evolution, Vol. 6, No. 4, 01.07.2006, p. 323-330.

@article{4dcf1eb4c34349f087a51bc4520bb263,
title = "A computer simulation analysis of the accuracy of partial genome sequencing and restriction fragment analysis in the reconstruction of phylogenetic relationships",
abstract = "Partial genome sequencing (PGS) and restriction fragment analysis (RFA) are used frequently in molecular epidemiologic investigations. The relative accuracy of PGS and RFA in phylogenetic reconstruction has not been assessed. In this study, 32 model phylogenetic trees with 16 extant lineages were generated, for which DNA sequences were simulated under varying conditions of genome length, nucleotide substitution rate, and between-site substitution rate variation. Genotyping using PGS and RFA was simulated. The effect of tree structure (stemminess, imbalance, lineage variation) on the accuracy of phylogenetic reconstruction (topological and branch length similarity) was evaluated. Overall, PGS was more accurate than RFA. The accuracy of PGS increased with increasing sequence length. The accuracy of RFA increased with the number of restriction enzymes used. In fragment size comparison, the Dice and Nei-Li algorithms differed little, with both more accurate than the Fragment Size Distribution algorithm. For RFA, higher tree stemminess and longer genome length were associated with higher topological accuracy, whereas lower tree stemminess and lower substitution rates were associated with higher branch length accuracy. For PGS, lower tree imbalance was associated with higher topological accuracy, whereas lower tree stemminess, higher substitution rate, and lower between-site substitution rate variation were associated with higher branch length accuracy. RFA had higher topological accuracy than PGS only for the shortest sequence length (200 bps) at a low substitution rate, high tree stemminess, and long genome length. PGS had equal or higher accuracy in branch length reconstruction than RFA under all conditions investigated. Thus, partial genome sequencing is recommended over restriction fragment analysis for conditions within the parameter space examined.",
keywords = "Computer simulation, Disease transmission, Molecular epidemiology, Partial genome sequencing, Phylogenetic reconstruction, Restriction fragment analysis",
author = "Qiao, Baozhen and Goldberg, {Tony L.} and Olsen, {Gary J.} and Weigel, {Ronald M.}",
year = "2006",
month = "7",
day = "1",
doi = "10.1016/j.meegid.2005.10.002",
language = "English (US)",
volume = "6",
pages = "323--330",
journal = "Infection, Genetics and Evolution",
issn = "1567-1348",
publisher = "Elsevier",
number = "4",
}

TY  - JOUR
T1  - A computer simulation analysis of the accuracy of partial genome sequencing and restriction fragment analysis in the reconstruction of phylogenetic relationships
AU  - Qiao, Baozhen
AU  - Goldberg, Tony L.
AU  - Olsen, Gary J.
AU  - Weigel, Ronald M.
PY  - 2006/7/1
Y1  - 2006/7/1
AB - Partial genome sequencing (PGS) and restriction fragment analysis (RFA) are used frequently in molecular epidemiologic investigations. The relative accuracy of PGS and RFA in phylogenetic reconstruction has not been assessed. In this study, 32 model phylogenetic trees with 16 extant lineages were generated, for which DNA sequences were simulated under varying conditions of genome length, nucleotide substitution rate, and between-site substitution rate variation. Genotyping using PGS and RFA was simulated. The effect of tree structure (stemminess, imbalance, lineage variation) on the accuracy of phylogenetic reconstruction (topological and branch length similarity) was evaluated. Overall, PGS was more accurate than RFA. The accuracy of PGS increased with increasing sequence length. The accuracy of RFA increased with the number of restriction enzymes used. In fragment size comparison, the Dice and Nei-Li algorithms differed little, with both more accurate than the Fragment Size Distribution algorithm. For RFA, higher tree stemminess and longer genome length were associated with higher topological accuracy, whereas lower tree stemminess and lower substitution rates were associated with higher branch length accuracy. For PGS, lower tree imbalance was associated with higher topological accuracy, whereas lower tree stemminess, higher substitution rate, and lower between-site substitution rate variation were associated with higher branch length accuracy. RFA had higher topological accuracy than PGS only for the shortest sequence length (200 bps) at a low substitution rate, high tree stemminess, and long genome length. PGS had equal or higher accuracy in branch length reconstruction than RFA under all conditions investigated. Thus, partial genome sequencing is recommended over restriction fragment analysis for conditions within the parameter space examined.
KW  - Computer simulation
KW  - Disease transmission
KW  - Molecular epidemiology
KW  - Partial genome sequencing
KW  - Phylogenetic reconstruction
KW  - Restriction fragment analysis
UR  - http://www.scopus.com/inward/record.url?scp=33646766475&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33646766475&partnerID=8YFLogxK
U2  - 10.1016/j.meegid.2005.10.002
DO  - 10.1016/j.meegid.2005.10.002
M3  - Article
C2  - 16406823
AN  - SCOPUS:33646766475
VL  - 6
SP  - 323
EP  - 330
JO  - Infection, Genetics and Evolution
JF  - Infection, Genetics and Evolution
SN  - 1567-1348
IS  - 4
ER  -