This report uses simulated data to examine the consequences of differential item functioning (DIF). The impact of DIF on total scores, item response theory (IRT) ability estimates, and test reliability was evaluated in testing scenarios created by manipulating four factors: test length, percentage of DIF items per form, sample sizes of the examinee groups, and types of responses. The results indicate that the greatest score difference between the examinee groups was observed on forms with DIF items; its magnitude was less than 2 points on the 0‐60 total score scale and .15 on the IRT ability scale. The influence on reliability was rather limited.
|Original language||English (US)|
|Journal||ETS Research Report Series|
|State||Published - Jun 1 2010|
- differential item functioning
- total score
- expected a posteriori