TY - JOUR
T1 - Using a Machine Learning Methodology to Analyze Reddit Posts regarding Child Feeding Information
AU - Donelson, Curtis
AU - Sutter, Carolyn
AU - Pham, Giang V.
AU - Narang, Kanika
AU - Wang, Chen
AU - Yun, Joseph T.
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2021/5
Y1 - 2021/5
AB - The current research used human-coded Reddit posts, categorized by established food parenting concepts (coercive control, structure, autonomy support, recipes), as a basis for machine learning models. The objectives were to provide insight into child-feeding topics discussed on social media and to give future research a way to use our trained model. Reddit posts from specific parenting-related subreddits were collected and labeled according to how they related to aspects of child-feeding behavior. Posts were then pre-processed, converted into TF-IDF vectors, and used to train support vector machine binary and multiclass classification models. Other classifiers and text pre-processing steps were also tested. After training, the binary model classified posts as being about child feeding or not with 86.1% accuracy, up from a baseline accuracy of 57.6%. The multiclass model reached 79.1% accuracy in classifying posts related to four categories of child feeding concepts (coercive control, autonomy support, structure, recipes), up from a baseline of 51.9%. The comparison models performed less favorably. The best-performing binary model is publicly available via the Social Media Macroscope, and we provide details on how to use it. Information is presented so that other researchers and professionals interested in examining child-feeding issues posted on social media can use the same approach.
KW - Computational methods
KW - Feeding
KW - Machine learning
KW - Parenting
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85101697894&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85101697894&partnerID=8YFLogxK
U2 - 10.1007/s10826-021-01923-5
DO - 10.1007/s10826-021-01923-5
M3 - Article
AN - SCOPUS:85101697894
SN - 1062-1024
VL - 30
SP - 1290
EP - 1298
JO - Journal of Child and Family Studies
JF - Journal of Child and Family Studies
IS - 5
ER -