Lost in parameter space: a road map for stacks

Josephine R. Paris, Jamie R. Stevens, Julian M. Catchen

Research output: Contribution to journal › Article

Abstract

Restriction site-Associated DNA sequencing (RAD-seq) has become a widely adopted method for genotyping populations of model and non-model organisms. Generating a reliable set of loci for downstream analysis requires appropriate use of bioinformatics software, such as the program stacks. Using three empirical RAD-seq datasets, we demonstrate a method for optimising a de novo assembly of loci using stacks. By iterating values of the program's main parameters and plotting resultant core metrics for visualisation, researchers can gain a much better understanding of their dataset and select an optimal set of parameters; we present the 80% rule as a generally effective method to select the core parameters for stacks. Visualisation of the metrics plotted for the three RAD-seq datasets shows that they differ in the optimal parameters that should be used to maximise the amount of available biological information. We also demonstrate that building loci de novo and then integrating alignment positions is more effective than aligning raw reads directly to a reference genome. Our methods will help the community in honing the analytical skills necessary to accurately assemble a RAD-seq dataset.

Original language: English (US)
Pages (from-to): 1360-1373
Number of pages: 14
Journal: Methods in Ecology and Evolution
Volume: 8
Issue number: 10
State: Published - Oct 2017

Keywords

  • RAD-seq
  • alignment
  • de novo assembly
  • parameter optimisation
  • population genetics
  • stacks

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Ecological Modeling
