TY - JOUR
T1 - IRT Item Parameter Scaling for Developing New Item Pools
AU - Kang, Hyeon Ah
AU - Lu, Ying
AU - Chang, Hua Hua
N1 - Publisher Copyright: © 2017 Taylor & Francis.
PY - 2017/1/2
Y1 - 2017/1/2
N2 - Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent calibration, (b) separate calibration with one linking, and (c) separate calibration with three sequential linkings. Evaluation across varying sample sizes and item pool sizes suggests that calibrating an item pool simultaneously results in the most stable scaling. The separate calibration procedures with linking produced larger scaling errors as the number of linking steps increased. Haebara’s item characteristic curve linking method performed better than the test characteristic curve (TCC) linking method. The present article provides an analytic illustration that the TCC method may fail to find global solutions for polytomous items. Finally, comparison of the single- and mixed-format item pools suggests that the use of polytomous items as anchors can improve the overall scaling accuracy of the item pools.
AB - Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent calibration, (b) separate calibration with one linking, and (c) separate calibration with three sequential linkings. Evaluation across varying sample sizes and item pool sizes suggests that calibrating an item pool simultaneously results in the most stable scaling. The separate calibration procedures with linking produced larger scaling errors as the number of linking steps increased. Haebara’s item characteristic curve linking method performed better than the test characteristic curve (TCC) linking method. The present article provides an analytic illustration that the TCC method may fail to find global solutions for polytomous items. Finally, comparison of the single- and mixed-format item pools suggests that the use of polytomous items as anchors can improve the overall scaling accuracy of the item pools.
UR - http://www.scopus.com/inward/record.url?scp=85003443607&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85003443607&partnerID=8YFLogxK
U2 - 10.1080/08957347.2016.1243537
DO - 10.1080/08957347.2016.1243537
M3 - Article
AN - SCOPUS:85003443607
SN - 0895-7347
VL - 30
SP - 1
EP - 15
JO - Applied Measurement in Education
JF - Applied Measurement in Education
IS - 1
ER -