TY - JOUR
T1 - Simulation with RADinitio improves RADseq experimental design and sheds light on sources of missing data
AU - Rivera-Colón, Angel G.
AU - Rochette, Nicolas C.
AU - Catchen, Julian M.
N1 - Publisher Copyright: © 2020 John Wiley & Sons Ltd
PY - 2021/2
Y1 - 2021/2
AB - Restriction-site associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large-scale evolutionary and genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants on the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their reduced representation experimental design, an often-complicated process. Strategic errors can lead to biased data generation that has reduced power to answer biological questions. Here, we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behave similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the efficacy of the software using different RAD protocols across several organisms, highlighting the importance of protocol selection on the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contributes more to missing alleles than population-level variation.
KW - RADseq
KW - bioinformatics
KW - genetics
KW - population
KW - simulations
UR - http://www.scopus.com/inward/record.url?scp=85084987161&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084987161&partnerID=8YFLogxK
U2 - 10.1111/1755-0998.13163
DO - 10.1111/1755-0998.13163
M3 - Article
C2 - 32275349
AN - SCOPUS:85084987161
SN - 1755-098X
VL - 21
SP - 363
EP - 378
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
IS - 2
ER -