Bulk Protein and Oil Prediction in Soybeans Using Transmission Raman Spectroscopy

A Comparison of Approaches to Optimize Accuracy

Rajveer Singh, Tomasz P. Wrobel, Prabuddha Mukherjee, Mark Gryka, Matthew Kole, Sandra Harrison, Rohit Bhargava

Research output: Contribution to journal › Article

Abstract

Rapid measurements of protein and oil content are important for a variety of uses, from sorting soybeans at the point of harvest to feedback during soybean meal production. In this study, our goal is to develop a simple protocol that permits rapid and robust quantitative prediction of soybean constituents using transmission Raman spectroscopy (TRS). To develop this approach, we systematically varied the elements of the measurement process to provide a diverse test bed. First, we used an in-house-built benchtop TRS instrument so that suitable optical configurations could be rapidly deployed and analyzed when collecting experimental data from individual soybean grains. Second, we used three soybean varieties with relatively low (33.97%), medium (36.98%), and high (41.23%) protein contents to test the development process. Third, samples from each variety were prepared as whole bean and with three different sample treatments (i.e., ground bean, whole meal, and ground meal). In each case, we modeled the data using partial least squares (PLS) regression and assessed spectral metric-based multiple linear regression (metric-MLR) approaches to build robust prediction models. The metric-MLR models showed lower root mean square errors of prediction (RMSEPs), and hence better prediction, than the corresponding classical PLS regression models for both bulk protein and oil for all treatment types. Comparing the sample preparation approaches, lower RMSEPs were observed for the ground meal treatment, and thus metric-MLR modeling with the ground meal treatment was considered to be the optimal protocol for bulk protein and oil prediction in soybean, with RMSEP values of 1.15 ± 0.04 (R² = 0.87) and 0.80 ± 0.02 (R² = 0.87) for bulk protein and oil, respectively. These predictions were nearly two- to threefold better (i.e., lower RMSEPs) than the corresponding NIR spectroscopy measurements (i.e., the secondary gold standard in the grain industry). For content prediction in whole soybean, incorporating physical attributes of individual grains in the metric-MLR approach shows up to 22% improvement in bulk protein prediction and a relatively mild (up to ∼5%) improvement in bulk oil prediction. The combination of the metric-MLR modeling approach (which is rare in the field of grain analysis) and sample treatments resulted in improved prediction models; using the physical attributes of individual grains is suggested as a novel measure for improving prediction accuracy.
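The abstract compares models by RMSEP and reports percentage improvements. The sketch below shows one common way such figures are computed; it is an illustration under stated assumptions, not the authors' code. The function names, the R² definition (1 - SS_res/SS_tot), the reading of "improvement" as a relative reduction in RMSEP, and the sample values are all assumptions introduced here and are not taken from the study.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a held-out set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def relative_improvement(rmsep_baseline, rmsep_candidate):
    """Percent reduction in RMSEP of the candidate relative to the baseline."""
    return 100.0 * (rmsep_baseline - rmsep_candidate) / rmsep_baseline


# Hypothetical reference values and predictions (percent protein),
# made up for illustration only.
reference = [34.0, 37.0, 41.0, 36.0]
model_a = [35.0, 36.0, 42.0, 37.0]  # every prediction is off by 1.0
model_b = [34.5, 36.5, 41.5, 36.5]  # every prediction is off by 0.5

rmsep_a = rmsep(reference, model_a)
rmsep_b = rmsep(reference, model_b)
print(rmsep_a, rmsep_b, relative_improvement(rmsep_a, rmsep_b))
print(r_squared(reference, model_a))
```

Under this reading, a lower RMSEP means predictions sit closer to the reference values, and a "twofold better" result corresponds to halving the RMSEP of the baseline.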

Original language: English (US)
Pages (from-to): 687-697
Number of pages: 11
Journal: Applied Spectroscopy
Volume: 73
Issue number: 6
DOIs: 10.1177/0003702818815642
State: Published - Jun 1 2019

Fingerprint

soybeans
Raman spectroscopy
oils
proteins
regression analysis
Linear regression
predictions
root-mean-square errors
test stands
classifying
Sorting
Mean square error
industries
Spectroscopy
Feedback
preparation

Keywords

  • MLR
  • NIR spectroscopy
  • PLS regression
  • Soybean
  • multiple linear regression
  • near-infrared spectroscopy
  • transmission Raman spectroscopy

ASJC Scopus subject areas

  • Instrumentation
  • Spectroscopy

Cite this

Bulk Protein and Oil Prediction in Soybeans Using Transmission Raman Spectroscopy: A Comparison of Approaches to Optimize Accuracy. / Singh, Rajveer; Wrobel, Tomasz P.; Mukherjee, Prabuddha; Gryka, Mark; Kole, Matthew; Harrison, Sandra; Bhargava, Rohit.

In: Applied Spectroscopy, Vol. 73, No. 6, 01.06.2019, p. 687-697.

@article{19a31243ed994303b919df1156cd1aef,
title = "Bulk Protein and Oil Prediction in Soybeans Using Transmission Raman Spectroscopy: A Comparison of Approaches to Optimize Accuracy",
keywords = "MLR, NIR spectroscopy, PLS regression, Soybean, multiple linear regression, near-infrared spectroscopy, transmission Raman spectroscopy",
author = "Rajveer Singh and Wrobel, {Tomasz P.} and Prabuddha Mukherjee and Mark Gryka and Matthew Kole and Sandra Harrison and Rohit Bhargava",
year = "2019",
month = "6",
day = "1",
doi = "10.1177/0003702818815642",
language = "English (US)",
volume = "73",
pages = "687--697",
journal = "Applied Spectroscopy",
issn = "0003-7028",
publisher = "Society for Applied Spectroscopy",
number = "6",

}

TY - JOUR

T1 - Bulk Protein and Oil Prediction in Soybeans Using Transmission Raman Spectroscopy

T2 - A Comparison of Approaches to Optimize Accuracy

AU - Singh, Rajveer

AU - Wrobel, Tomasz P.

AU - Mukherjee, Prabuddha

AU - Gryka, Mark

AU - Kole, Matthew

AU - Harrison, Sandra

AU - Bhargava, Rohit

PY - 2019/6/1

Y1 - 2019/6/1

KW - MLR

KW - NIR spectroscopy

KW - PLS regression

KW - Soybean

KW - multiple linear regression

KW - near-infrared spectroscopy

KW - transmission Raman spectroscopy

UR - http://www.scopus.com/inward/record.url?scp=85061051664&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061051664&partnerID=8YFLogxK

U2 - 10.1177/0003702818815642

DO - 10.1177/0003702818815642

M3 - Article

VL - 73

SP - 687

EP - 697

JO - Applied Spectroscopy

JF - Applied Spectroscopy

SN - 0003-7028

IS - 6

ER -