TY - JOUR
T1 - Importance of genetic architecture in marker selection decisions for genomic prediction
AU - Della Coletta, Rafael
AU - Fernandes, Samuel B.
AU - Monnahan, Patrick J.
AU - Mikel, Mark A.
AU - Bohn, Martin O.
AU - Lipka, Alexander E.
AU - Hirsch, Candice N.
N1 - This work was supported by the United States Department of Agriculture (2018-67013-27571), the National Science Foundation (IOS-1546727), and the Minnesota Agricultural Experiment Station. RDC was supported by the University of Minnesota MnDRIVE Global Food Ventures Graduate Fellowship and the University of Minnesota Doctoral Dissertation Fellowship.
We thank DOW AgroScience (now Corteva Agriscience) for providing in-kind support through the custom Illumina Infinium 20k SNP chip. We thank the Minnesota Supercomputing Institute at the University of Minnesota (http://www.msi.umn.edu) for providing resources that contributed to the research results reported in this article.
PY - 2023/11
Y1 - 2023/11
N2 - Key message: We demonstrate potential for improved multi-environment genomic prediction accuracy using structural variant markers. However, the degree of observed improvement is highly dependent on the genetic architecture of the trait. Abstract: Breeders commonly use genetic markers to predict the performance of untested individuals as a way to improve the efficiency of breeding programs. These genomic prediction models have almost exclusively used single nucleotide polymorphisms (SNPs) as their source of genetic information, even though other types of markers exist, such as structural variants (SVs). Given that SVs are associated with environmental adaptation and not all of them are in linkage disequilibrium (LD) with SNPs, SVs have the potential to bring additional information to multi-environment prediction models that is not captured by SNPs alone. Here, we evaluated the effect of different marker types (SNPs and/or SVs) on prediction accuracy across a range of genetic architectures for simulated traits across multiple environments. Our results show that SVs can improve prediction accuracy, but the improvement is highly dependent on the genetic architecture of the trait, and the relative gain in accuracy is minimal. When SVs are the only causative variant type, SV predictors outperform SNP predictors 70% of the time. However, the improvement in accuracy in these instances is only 1.5% on average. Further simulations with predictors in varying degrees of LD with causative variants of different types (e.g., SNPs, SVs, SNPs and SVs) showed that prediction accuracy increased as LD between causative variants and predictors increased, regardless of the marker type. This study demonstrates that, when deciding what markers to use in large-scale genomic prediction modeling in a breeding program, knowing the genetic architecture of a trait is more important than the type of marker used.
AB - Key message: We demonstrate potential for improved multi-environment genomic prediction accuracy using structural variant markers. However, the degree of observed improvement is highly dependent on the genetic architecture of the trait. Abstract: Breeders commonly use genetic markers to predict the performance of untested individuals as a way to improve the efficiency of breeding programs. These genomic prediction models have almost exclusively used single nucleotide polymorphisms (SNPs) as their source of genetic information, even though other types of markers exist, such as structural variants (SVs). Given that SVs are associated with environmental adaptation and not all of them are in linkage disequilibrium (LD) with SNPs, SVs have the potential to bring additional information to multi-environment prediction models that is not captured by SNPs alone. Here, we evaluated the effect of different marker types (SNPs and/or SVs) on prediction accuracy across a range of genetic architectures for simulated traits across multiple environments. Our results show that SVs can improve prediction accuracy, but the improvement is highly dependent on the genetic architecture of the trait, and the relative gain in accuracy is minimal. When SVs are the only causative variant type, SV predictors outperform SNP predictors 70% of the time. However, the improvement in accuracy in these instances is only 1.5% on average. Further simulations with predictors in varying degrees of LD with causative variants of different types (e.g., SNPs, SVs, SNPs and SVs) showed that prediction accuracy increased as LD between causative variants and predictors increased, regardless of the marker type. This study demonstrates that, when deciding what markers to use in large-scale genomic prediction modeling in a breeding program, knowing the genetic architecture of a trait is more important than the type of marker used.
KW - Genotype-by-environment
KW - Maize
KW - Plant breeding
KW - Simulation
KW - Structural variation
UR - http://www.scopus.com/inward/record.url?scp=85173605940&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85173605940&partnerID=8YFLogxK
U2 - 10.1007/s00122-023-04469-w
DO - 10.1007/s00122-023-04469-w
M3 - Article
C2 - 37819415
AN - SCOPUS:85173605940
SN - 0040-5752
VL - 136
JO - Theoretical and Applied Genetics
JF - Theoretical and Applied Genetics
IS - 11
M1 - 220
ER -