TY - JOUR
T1 - Genome-enabled prediction of reproductive traits in Nellore cattle using parametric models and machine learning methods
AU - Alves, A. A. C.
AU - Espigolan, R.
AU - Bresolin, T.
AU - Costa, R. M.
AU - Fernandes Júnior, G. A.
AU - Ventura, R. V.
AU - Carvalheiro, R.
AU - Albuquerque, L. G.
N1 - Funding Information:
This research was financially supported by the São Paulo Research Foundation – FAPESP (grants 2009/16118-5, 2016/24227-2, 2017/10630-2 and 2018/20026-8) and partially by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – CAPES (finance code 001). The authors would like to thank the breeding programs contributing to the Alliance Nellore dataset for providing the necessary data. Finally, the authors thank the reviewers for their valuable comments.
Publisher Copyright:
© 2020 Stichting International Foundation for Animal Genetics
PY - 2021/2
Y1 - 2021/2
AB - This study aimed to assess the predictive ability of different machine learning (ML) methods for genomic prediction of reproductive traits in Nellore cattle. The studied traits were age at first calving (AFC), scrotal circumference (SC), early pregnancy (EP) and stayability (STAY). The numbers of genotyped animals and SNP markers available were 2342 and 321 419 (AFC), 4671 and 309 486 (SC), 2681 and 319 619 (STAY) and 3356 and 319 108 (EP). The predictive abilities of support vector regression (SVR), Bayesian regularized artificial neural network (BRANN) and random forest (RF) were compared with results obtained using parametric models (genomic best linear unbiased predictor, GBLUP, and Bayesian least absolute shrinkage and selection operator, BLASSO). A 5-fold cross-validation strategy was performed and the average prediction accuracy (ACC) and mean squared error (MSE) were computed. The ACC was defined as the linear correlation between predicted and observed breeding values for categorical traits (EP and STAY) and as the correlation between predicted and observed adjusted phenotypes divided by the square root of the estimated heritability for continuous traits (AFC and SC). The average ACC varied from low to moderate depending on the trait and model under consideration, ranging between 0.56 and 0.63 (AFC), 0.27 and 0.36 (SC), 0.57 and 0.67 (EP), and 0.52 and 0.62 (STAY). SVR provided slightly better accuracies than the parametric models for all traits, increasing the prediction accuracy for AFC by around 6.3% and 4.8% compared with GBLUP and BLASSO, respectively. Likewise, there was an increase of 8.3% for SC, 4.5% for EP and 4.8% for STAY, comparing SVR with both GBLUP and BLASSO. In contrast, RF and BRANN did not present competitive predictive ability compared with the parametric models. The results indicate that SVR is a suitable method for genome-enabled prediction of reproductive traits in Nellore cattle. Further, the optimal kernel bandwidth parameter in the SVR model was trait-dependent; thus, fine-tuning this hyperparameter in the training phase is crucial.
KW - artificial neural network
KW - fertility traits
KW - genomic selection
KW - random forest
KW - support vector regression
UR - http://www.scopus.com/inward/record.url?scp=85096656152&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096656152&partnerID=8YFLogxK
U2 - 10.1111/age.13021
DO - 10.1111/age.13021
M3 - Article
C2 - 33191532
AN - SCOPUS:85096656152
SN - 0268-9146
VL - 52
SP - 32
EP - 46
JO - Animal Genetics
JF - Animal Genetics
IS - 1
ER -