TY - JOUR
T1 - Predicting Cognitive Outcome Through Nutrition and Health Markers Using Supervised Machine Learning
AU - Verma, Shreya
AU - Holthaus, Tori A.
AU - Martell, Shelby
AU - Holscher, Hannah D.
AU - Zhu, Ruoqing
AU - Khan, Naiman A.
N1 - The authors’ responsibilities were as follows – SV: performed analysis, wrote the original draft, and visualized and conceived the study; SV, RZ, NAK: were responsible for funding acquisition. TH, SM, HDH, RZ, NAK, reviewed and edited the manuscript. HDH, RZ, NAK: were responsible for resources. NAK: supervised, investigated, and conceptualized the study; and all authors: read and approved the final manuscript. This research was supported by the Personalized Nutrition Initiative, the National Center for Supercomputing Applications, and the Division of Nutritional Sciences at the University of Illinois Urbana-Champaign, Urbana, IL, United States and the Hass Avocado Board.
PY - 2025
Y1 - 2025
N2 - Background: Machine learning (ML) use in health research is growing, yet its application to predict cognitive outcomes using diverse health indicators is underinvestigated. Objectives: We used ML models to predict cognitive performance based on a set of health and behavioral factors, aiming to identify key contributors to cognitive function for insights into potential personalized interventions. Methods: Data from 374 adults aged 19–82 y (227 females) were used to develop ML models predicting cognitive performance (reaction time in milliseconds) on a modified Eriksen flanker task. Features included demographics, anthropometric measures, dietary indices (Healthy Eating Index, Dietary Approaches to Stop Hypertension, Mediterranean, and Mediterranean–Dietary Approaches to Stop Hypertension Intervention for Neurodegenerative Delay), self-reported physical activity, and systolic and diastolic blood pressures. The data set was split (80:20) for training and testing. Predictive models (decision trees, random forest, AdaBoost, XGBoost, gradient boosting, linear, ridge, and lasso regression) were used with hyperparameter tuning and crossvalidation. Feature importance was calculated using permutation importance, whereas performance using mean absolute error (MAE) and mean squared error. Results: Random forest regressor exhibited the best performance, with the lowest MAE (training: 0.66 ms; testing: 0.78 ms) and mean squared error (training: 0.70 ms²; testing: 1.05 ms²). Age was the most significant feature (score: 0.208), followed by diastolic blood pressure (0.169), BMI (0.079), systolic blood pressure (0.069), and Healthy Eating Index (0.048). Ethnicity (0.005) and sex (0.003) had minimal predictive effect. Conclusions: Age, blood pressure, and BMI show strong associations with cognitive performance, whereas diet quality has a subtler effect. These findings highlight the potential of ML models for developing personalized interventions and preventive strategies for cognitive decline.
AB - Background: Machine learning (ML) use in health research is growing, yet its application to predict cognitive outcomes using diverse health indicators is underinvestigated. Objectives: We used ML models to predict cognitive performance based on a set of health and behavioral factors, aiming to identify key contributors to cognitive function for insights into potential personalized interventions. Methods: Data from 374 adults aged 19–82 y (227 females) were used to develop ML models predicting cognitive performance (reaction time in milliseconds) on a modified Eriksen flanker task. Features included demographics, anthropometric measures, dietary indices (Healthy Eating Index, Dietary Approaches to Stop Hypertension, Mediterranean, and Mediterranean–Dietary Approaches to Stop Hypertension Intervention for Neurodegenerative Delay), self-reported physical activity, and systolic and diastolic blood pressures. The data set was split (80:20) for training and testing. Predictive models (decision trees, random forest, AdaBoost, XGBoost, gradient boosting, linear, ridge, and lasso regression) were used with hyperparameter tuning and crossvalidation. Feature importance was calculated using permutation importance, whereas performance using mean absolute error (MAE) and mean squared error. Results: Random forest regressor exhibited the best performance, with the lowest MAE (training: 0.66 ms; testing: 0.78 ms) and mean squared error (training: 0.70 ms²; testing: 1.05 ms²). Age was the most significant feature (score: 0.208), followed by diastolic blood pressure (0.169), BMI (0.079), systolic blood pressure (0.069), and Healthy Eating Index (0.048). Ethnicity (0.005) and sex (0.003) had minimal predictive effect. Conclusions: Age, blood pressure, and BMI show strong associations with cognitive performance, whereas diet quality has a subtler effect. These findings highlight the potential of ML models for developing personalized interventions and preventive strategies for cognitive decline.
KW - cognitive function
KW - dietary patterns
KW - MIND diet
KW - personalized health
KW - random forest
UR - http://www.scopus.com/inward/record.url?scp=105007454574&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105007454574&partnerID=8YFLogxK
U2 - 10.1016/j.tjnut.2025.05.003
DO - 10.1016/j.tjnut.2025.05.003
M3 - Article
C2 - 40368299
AN - SCOPUS:105007454574
SN - 0022-3166
JO - Journal of Nutrition
JF - Journal of Nutrition
ER -