TY - JOUR
T1 - An Investigation of Different Treatment Strategies for Item Category Collapsing in Calibration
T2 - An Empirical Study
AU - Tay-Lim, Brenda Siok Hoon
AU - Zhang, Jinming
N1 - Publisher Copyright: © Taylor & Francis Group, LLC.
PY - 2015/4/3
Y1 - 2015/4/3
AB - To ensure the statistical result validity, model-data fit must be evaluated for each item. In practice, certain actions or treatments are needed for misfit items. If all misfit items are treated, much item information would be lost during calibration. On the other hand, if only severely misfit items are treated, the inclusion of misfit items may invalidate the statistical inferences based on the estimated item response models. Hence, given response data, one has to find a balance between treating too few and too many misfit items. In this article, misfit items are classified into three categories based on the extent of misfit. Accordingly, three different item treatment strategies are proposed in determining which categories of misfit items should be treated. The impact of using different strategies is investigated. The results show that the test information functions obtained under different strategies can be substantially different in some ability ranges.
UR - http://www.scopus.com/inward/record.url?scp=84926325461&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84926325461&partnerID=8YFLogxK
U2 - 10.1080/08957347.2014.1002917
DO - 10.1080/08957347.2014.1002917
M3 - Article
AN - SCOPUS:84926325461
VL - 28
SP - 143
EP - 155
JO - Applied Measurement in Education
JF - Applied Measurement in Education
SN - 0895-7347
IS - 2
ER -