TY - JOUR
T1 - Detecting differential item functioning with confirmatory factor analysis and item response theory
T2 - Toward a unified strategy
AU - Stark, Stephen
AU - Chernyshenko, Oleksandr S.
AU - Drasgow, Fritz
PY - 2006/11
Y1 - 2006/11
AB - In this article, the authors developed a common strategy for identifying differential item functioning (DIF) items that can be implemented in both the mean and covariance structures method (MACS) and item response theory (IRT). They proposed examining the loadings (discrimination) and the intercept (location) parameters simultaneously using the likelihood ratio test with a free-baseline model and Bonferroni corrected critical p values. They compared the relative efficacy of this approach with alternative implementations for various types and amounts of DIF, sample sizes, numbers of response categories, and amounts of impact (latent mean differences). Results indicated that the proposed strategy was considerably more effective than an alternative approach involving a constrained-baseline model. Both MACS and IRT performed similarly well in the majority of experimental conditions. As expected, MACS performed slightly worse in dichotomous conditions but better than IRT in polytomous cases where sample sizes were small. Also, contrary to popular belief, MACS performed well in conditions where DIF was simulated on item thresholds (item means), and its accuracy was not affected by impact.
KW - CFA
KW - Confirmatory factor analysis
KW - DIF
KW - Differential item functioning
KW - IRT
KW - Item response theory
KW - MACS
KW - Mean and covariance structures
KW - Measurement equivalence
KW - Sample size
KW - Thresholds
UR - http://www.scopus.com/inward/record.url?scp=33750978441&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750978441&partnerID=8YFLogxK
U2 - 10.1037/0021-9010.91.6.1292
DO - 10.1037/0021-9010.91.6.1292
M3 - Article
C2 - 17100485
AN - SCOPUS:33750978441
SN - 0021-9010
VL - 91
SP - 1292
EP - 1306
JO - Journal of Applied Psychology
JF - Journal of Applied Psychology
IS - 6
ER -