TY - JOUR
T1 - Using computer adaptive testing to assess physics proficiency and improve exam performance in an introductory physics course
AU - Morphew, Jason W.
AU - Mestre, Jose P.
AU - Kang, Hyeon Ah
AU - Chang, Hua Hua
AU - Fabry, Gregory
N1 - Funding Information:
This research was supported in part by the National Science Foundation under Grant No. DRL 1252389. Any opinions, findings, and conclusions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. We thank Prof. Carolyn Anderson for her suggestions on explaining the psychometric properties of computer adaptive testing.
Publisher Copyright:
© 2018 authors. Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.
PY - 2018/9/20
Y1 - 2018/9/20
N2 - Prior research has established that students often underprepare for midterm examinations yet remain overconfident in their proficiency. Research concerning the testing effect has demonstrated that using testing as a study strategy leads to higher performance and more accurate confidence compared to more common study strategies such as rereading or reviewing homework problems. We report on three experiments that explore the viability of using computer adaptive testing (CAT) for assessing students' physics proficiency, for preparing students for midterm exams by diagnosing their weaknesses, and for predicting scores on midterm exams in an introductory calculus-based mechanics course for science and engineering majors. The first two experiments evaluated the reliability and validity of the CAT algorithm. In addition, we investigated the ability of the CAT test to predict performance on the midterm exam. The third experiment explored whether completing two CAT tests in the days before a midterm exam would facilitate performance on the midterm exam. Scores on the CAT tests and the midterm exams were significantly correlated and, on average, were not statistically different from each other. This provides evidence for moderate parallel-forms reliability and criterion-related validity of the CAT algorithm. In addition, when used as a diagnostic tool, CAT showed promise in helping students perform better on midterm exams. Finally, we found that the CAT tests predicted average performance on the midterm exams reasonably well; however, the CAT tests were not as accurate as desired at predicting the performance of individual students. While CAT shows promise for practice testing, more research is needed to refine testing algorithms to increase reliability before implementing CAT for summative evaluations. In light of these findings, we believe that more research is needed comparing CAT to traditional paper-and-pencil practice tests to determine whether the effort needed to create a CAT system is worthwhile.
AB - Prior research has established that students often underprepare for midterm examinations yet remain overconfident in their proficiency. Research concerning the testing effect has demonstrated that using testing as a study strategy leads to higher performance and more accurate confidence compared to more common study strategies such as rereading or reviewing homework problems. We report on three experiments that explore the viability of using computer adaptive testing (CAT) for assessing students' physics proficiency, for preparing students for midterm exams by diagnosing their weaknesses, and for predicting scores on midterm exams in an introductory calculus-based mechanics course for science and engineering majors. The first two experiments evaluated the reliability and validity of the CAT algorithm. In addition, we investigated the ability of the CAT test to predict performance on the midterm exam. The third experiment explored whether completing two CAT tests in the days before a midterm exam would facilitate performance on the midterm exam. Scores on the CAT tests and the midterm exams were significantly correlated and, on average, were not statistically different from each other. This provides evidence for moderate parallel-forms reliability and criterion-related validity of the CAT algorithm. In addition, when used as a diagnostic tool, CAT showed promise in helping students perform better on midterm exams. Finally, we found that the CAT tests predicted average performance on the midterm exams reasonably well; however, the CAT tests were not as accurate as desired at predicting the performance of individual students. While CAT shows promise for practice testing, more research is needed to refine testing algorithms to increase reliability before implementing CAT for summative evaluations. In light of these findings, we believe that more research is needed comparing CAT to traditional paper-and-pencil practice tests to determine whether the effort needed to create a CAT system is worthwhile.
UR - http://www.scopus.com/inward/record.url?scp=85054485828&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054485828&partnerID=8YFLogxK
U2 - 10.1103/PhysRevPhysEducRes.14.020110
DO - 10.1103/PhysRevPhysEducRes.14.020110
M3 - Article
AN - SCOPUS:85054485828
SN - 2469-9896
VL - 14
JO - Physical Review Physics Education Research
JF - Physical Review Physics Education Research
IS - 2
M1 - 020110
ER -