TY - JOUR
T1 - AutoML Feature Engineering for Student Modeling Yields High Accuracy, but Limited Interpretability
AU - Bosch, Nigel
N1 - Publisher Copyright: © 2021. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
PY - 2021
Y1 - 2021
N2 - Automatic machine learning (AutoML) methods automate the time-consuming feature engineering process so that researchers can produce accurate student models more quickly and easily. In this paper, we compare two AutoML feature engineering methods in the context of the National Assessment of Educational Progress (NAEP) data mining competition. The methods we compare, Featuretools and TSFRESH (Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests), have rarely been applied to student interaction log data. Thus, we address research questions regarding the accuracy of models built with AutoML features, how AutoML feature types compare to each other and to expert-engineered features, and how interpretable the features are. Additionally, we developed a novel feature selection method that addresses problems that arise when applying AutoML feature engineering in this context, where there were many heterogeneous features (over 4,000) and relatively few students. Our entry to the NAEP competition placed 3rd overall on the final held-out dataset and 1st on the public leaderboard, with a final Cohen’s kappa = .212 and area under the receiver operating characteristic curve (AUC) = .665 when predicting whether students would manage their time effectively on a math assessment. We found that TSFRESH features were significantly more effective than either Featuretools features or expert-engineered features in this context; however, they were also among the most difficult features to interpret, based on a survey of six experts’ judgments. Finally, we discuss the tradeoffs between effort and interpretability that arise in AutoML-based student modeling.
KW - AutoML
KW - feature engineering
KW - feature selection
KW - student modeling
UR - http://www.scopus.com/inward/record.url?scp=85132624486&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132624486&partnerID=8YFLogxK
U2 - 10.5281/zenodo.5275314
DO - 10.5281/zenodo.5275314
M3 - Article
AN - SCOPUS:85132624486
SN - 2157-2100
VL - 13
SP - 55
EP - 79
JO - Journal of Educational Data Mining
JF - Journal of Educational Data Mining
IS - 2
ER -