Genome-enabled prediction of reproductive traits in Nellore cattle using parametric models and machine learning methods

A. A.C. Alves, R. Espigolan, T. Bresolin, R. M. Costa, G. A. Fernandes Júnior, R. V. Ventura, R. Carvalheiro, L. G. Albuquerque

Research output: Contribution to journal › Article › peer-review

Abstract

This study aimed to assess the predictive ability of different machine learning (ML) methods for genomic prediction of reproductive traits in Nellore cattle. The studied traits were age at first calving (AFC), scrotal circumference (SC), early pregnancy (EP) and stayability (STAY). The numbers of genotyped animals and SNP markers available were 2342 and 321 419 (AFC), 4671 and 309 486 (SC), 2681 and 319 619 (STAY) and 3356 and 319 108 (EP). The predictive abilities of support vector regression (SVR), Bayesian regularized artificial neural network (BRANN) and random forest (RF) were compared with those of parametric models (genomic best linear unbiased predictor, GBLUP, and Bayesian least absolute shrinkage and selection operator, BLASSO). A 5-fold cross-validation strategy was used, and the average prediction accuracy (ACC) and mean squared error (MSE) were computed. The ACC was defined as the linear correlation between predicted and observed breeding values for categorical traits (EP and STAY) and as the correlation between predicted and observed adjusted phenotypes divided by the square root of the estimated heritability for continuous traits (AFC and SC). The average ACC varied from low to moderate depending on the trait and model under consideration, ranging between 0.56 and 0.63 (AFC), 0.27 and 0.36 (SC), 0.57 and 0.67 (EP), and 0.52 and 0.62 (STAY). SVR provided slightly better accuracies than the parametric models for all traits, increasing the prediction accuracy for AFC by around 6.3 and 4.8% compared with GBLUP and BLASSO respectively. Likewise, there was an increase of 8.3% for SC, 4.5% for EP and 4.8% for STAY when comparing SVR with both GBLUP and BLASSO. In contrast, RF and BRANN did not show competitive predictive ability compared with the parametric models. The results indicate that SVR is a suitable method for genome-enabled prediction of reproductive traits in Nellore cattle. Further, the optimal kernel bandwidth parameter in the SVR model was trait-dependent, so fine-tuning this hyperparameter in the training phase is crucial.
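The two accuracy definitions and the 5-fold cross-validation loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the random fold assignment, the heritability value `h2=0.3`, the simulated data and the one-covariate least-squares `toy_fit_predict` stand-in are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Linear (Pearson) correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mse(pred, obs):
    """Mean squared error between predictions and observations."""
    return mean([(p - o) ** 2 for p, o in zip(pred, obs)])


def acc_continuous(pred, adj_pheno, h2):
    """Continuous traits: correlation with adjusted phenotypes / sqrt(h2)."""
    return pearson(pred, adj_pheno) / math.sqrt(h2)


def acc_categorical(pred, obs_bv):
    """Categorical traits: plain correlation with observed breeding values."""
    return pearson(pred, obs_bv)


def kfold_indices(n, k=5, seed=0):
    """Split indices 0..n-1 into k disjoint folds (random split is assumed)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(y, fit_predict, h2, k=5, seed=0):
    """Average ACC and MSE over k folds for a continuous trait."""
    accs, mses = [], []
    for test in kfold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        pred = fit_predict(train, test)  # train on k-1 folds, predict the rest
        obs = [y[i] for i in test]
        accs.append(acc_continuous(pred, obs, h2))
        mses.append(mse(pred, obs))
    return mean(accs), mean(mses)


# Simulated toy data (hypothetical): one covariate x, noisy phenotype y.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]


def toy_fit_predict(train, test):
    """Stand-in predictor: simple least squares on the training fold."""
    xt = [x[i] for i in train]
    yt = [y[i] for i in train]
    mx, my = mean(xt), mean(yt)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xt, yt))
             / sum((a - mx) ** 2 for a in xt))
    return [my + slope * (x[i] - mx) for i in test]


# h2=0.3 is an assumed heritability, not a value reported in the abstract.
avg_acc, avg_mse = cross_validate(y, toy_fit_predict, h2=0.3)
```

Any of the compared models (GBLUP, BLASSO, SVR, BRANN, RF) could in principle be plugged in through the `fit_predict` callback. Note that dividing by the square root of the heritability can push the continuous-trait ACC above 1 when the raw correlation is high relative to `h2`.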

Original language: English (US)
Pages (from-to): 32-46
Number of pages: 15
Journal: Animal genetics
Volume: 52
Issue number: 1
State: Published - Feb 2021
Externally published: Yes

Keywords

  • artificial neural network
  • fertility traits
  • genomic selection
  • random forest
  • support vector regression

ASJC Scopus subject areas

  • Animal Science and Zoology
  • Genetics
