TY - GEN
T1 - Grading-based test suite augmentation
AU - Osei-Owusu, Jonathan
AU - Astorga, Angello
AU - Butler, Liia
AU - Xie, Tao
AU - Challen, Geoffrey
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - Enrollment in introductory programming (CS1) courses continues to surge and hundreds of CS1 students can produce thousands of submissions for a single problem, all requiring timely and accurate grading. One way that instructors can efficiently grade is to construct a custom instructor test suite that compares a student submission to a reference solution over randomly generated or hand-crafted inputs. However, such test suite is often insufficient, causing incorrect submissions to be marked as correct. To address this issue, we propose the Grasa (GRAding-based test Suite Augmentation) approach consisting of two techniques. Grasa first detects and clusters incorrect submissions by approximating their behavioral equivalence to each other. To augment the existing instructor test suite, Grasa generates a minimal set of additional tests that help detect the incorrect submissions. We evaluate our Grasa approach on a dataset of CS1 student submissions for three programming problems. Our preliminary results show that Grasa can effectively identify incorrect student submissions and minimally augment the instructor test suite.
AB - Enrollment in introductory programming (CS1) courses continues to surge and hundreds of CS1 students can produce thousands of submissions for a single problem, all requiring timely and accurate grading. One way that instructors can efficiently grade is to construct a custom instructor test suite that compares a student submission to a reference solution over randomly generated or hand-crafted inputs. However, such test suite is often insufficient, causing incorrect submissions to be marked as correct. To address this issue, we propose the Grasa (GRAding-based test Suite Augmentation) approach consisting of two techniques. Grasa first detects and clusters incorrect submissions by approximating their behavioral equivalence to each other. To augment the existing instructor test suite, Grasa generates a minimal set of additional tests that help detect the incorrect submissions. We evaluate our Grasa approach on a dataset of CS1 student submissions for three programming problems. Our preliminary results show that Grasa can effectively identify incorrect student submissions and minimally augment the instructor test suite.
KW - Clustering
KW - Programming education
KW - Testing
UR - http://www.scopus.com/inward/record.url?scp=85078875342&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078875342&partnerID=8YFLogxK
U2 - 10.1109/ASE.2019.00030
DO - 10.1109/ASE.2019.00030
M3 - Conference contribution
AN - SCOPUS:85078875342
T3 - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
SP - 226
EP - 229
BT - Proceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Y2 - 10 November 2019 through 15 November 2019
ER -