TY - JOUR
T1 - Deriving genotypes from RAD-seq short-read data using Stacks
AU - Rochette, Nicolas C.
AU - Catchen, Julian M.
N1 - Publisher Copyright:
© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
PY - 2017/12/1
Y1 - 2017/12/1
N2 - Restriction site-associated DNA sequencing (RAD-seq) allows for the genome-wide discovery and genotyping of single-nucleotide polymorphisms in hundreds of individuals at a time in model and nonmodel species alike. However, converting short-read sequencing data into reliable genotype data remains a nontrivial task, especially as RAD-seq is used in systems that have very diverse genomic properties. Here, we present a protocol to analyze RAD-seq data using the Stacks pipeline. This protocol will be of use in areas such as ecology and population genetics. It covers the assessment and demultiplexing of the sequencing data, read mapping, inference of RAD loci, genotype calling, and filtering of the output data, as well as providing two simple examples of downstream biological analyses. We place special emphasis on checking the soundness of the procedure and choosing the main parameters, given the properties of the data. The procedure can be completed in 1 week, but determining definitive methodological choices will typically take up to 1 month.
AB - Restriction site-associated DNA sequencing (RAD-seq) allows for the genome-wide discovery and genotyping of single-nucleotide polymorphisms in hundreds of individuals at a time in model and nonmodel species alike. However, converting short-read sequencing data into reliable genotype data remains a nontrivial task, especially as RAD-seq is used in systems that have very diverse genomic properties. Here, we present a protocol to analyze RAD-seq data using the Stacks pipeline. This protocol will be of use in areas such as ecology and population genetics. It covers the assessment and demultiplexing of the sequencing data, read mapping, inference of RAD loci, genotype calling, and filtering of the output data, as well as providing two simple examples of downstream biological analyses. We place special emphasis on checking the soundness of the procedure and choosing the main parameters, given the properties of the data. The procedure can be completed in 1 week, but determining definitive methodological choices will typically take up to 1 month.
UR - http://www.scopus.com/inward/record.url?scp=85036609416&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85036609416&partnerID=8YFLogxK
U2 - 10.1038/nprot.2017.123
DO - 10.1038/nprot.2017.123
M3 - Article
C2 - 29189774
AN - SCOPUS:85036609416
SN - 1754-2189
VL - 12
SP - 2640
EP - 2659
JO - Nature Protocols
JF - Nature Protocols
IS - 12
ER -