Simulation with RADinitio improves RADseq experimental design and sheds light on sources of missing data

Angel G. Rivera-Colón, Nicolas C. Rochette, Julian M. Catchen

Research output: Contribution to journal › Article › peer-review

Abstract

Restriction-site associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large-scale evolutionary and genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants on the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their reduced representation experimental design, an often-complicated process. Strategic errors can lead to biased data generation that has reduced power to answer biological questions. Here, we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behave similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the efficacy of the software using different RAD protocols across several organisms, highlighting the importance of protocol selection on the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contribute more to missing alleles than population-level variation.

Original language: English (US)
Journal: Molecular Ecology Resources
State: Accepted/In press - 2020

Keywords

  • RADseq
  • bioinformatics
  • genetics
  • population
  • simulations

ASJC Scopus subject areas

  • Biotechnology
  • Ecology, Evolution, Behavior and Systematics
  • Genetics
