TY - JOUR
T1 - An Assessment of the Factors Influencing the Prediction Accuracy of Genomic Prediction Models Across Multiple Environments
AU - Widener, Sarah
AU - Graef, George
AU - Lipka, Alexander E.
AU - Jarquin, Diego
N1 - Funding Information:
This research was funded by USDA-NIFA Grant Number 2018-68005-27937. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. The USDA is an equal opportunity provider and employer.
Publisher Copyright:
Copyright © 2021 Widener, Graef, Lipka and Jarquin.
PY - 2021/7/23
Y1 - 2021/7/23
N2 - The effects of climate change create formidable challenges for breeders striving to produce sufficient food quantities in rapidly changing environments. It is therefore critical to investigate the ability of multi-environment genomic prediction (GP) models to predict genomic estimated breeding values (GEBVs) in extreme environments. Exploration of the impact of training set composition on the accuracy of such GEBVs is also essential. Accordingly, we examined the influence of the number of training environments and the use of environmental covariates (ECs) in GS models on four subsets of n = 500 lines of the soybean nested association mapping (SoyNAM) panel grown in nine environments in the US-North Central Region. The ensuing analyses provided insights into the influence of both of these factors for predicting grain yield in the most and the least extreme of these environments. We found that only a subset of the available environments was needed to obtain the highest observed prediction accuracies. The inclusion of ECs in the GP model did not substantially increase prediction accuracies relative to competing models, and instead more often resulted in negative prediction accuracies. Combined with the overall low prediction accuracies for grain yield in the most extreme environment, our findings highlight weaknesses in current GP approaches for prediction in extreme environments, and point to specific areas on which to focus future research efforts.
KW - environmental covariates (ECs)
KW - extreme environmental conditions
KW - genomic selection (GS)
KW - genotype-by-environment (GE) interaction
KW - soybean nested association mapping (SoyNAM) populations
UR - http://www.scopus.com/inward/record.url?scp=85112131068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112131068&partnerID=8YFLogxK
U2 - 10.3389/fgene.2021.689319
DO - 10.3389/fgene.2021.689319
M3 - Article
C2 - 34367248
AN - SCOPUS:85112131068
SN - 1664-8021
VL - 12
JO - Frontiers in Genetics
JF - Frontiers in Genetics
M1 - 689319
ER -