Measuring the Difference Between Two Models

Michael V. Levine, Fritz Drasgow, Bruce Williams, Christopher McCusker, Gary L. Thomasson

Research output: Contribution to journal › Article › peer-review

Abstract

Two psychometric models with very different parametric formulas and item response functions can make virtually the same predictions in all applications. By applying some basic results from the theory of hypothesis testing and from signal detection theory, the power of the most powerful test for distinguishing the models can be computed. Measuring model misspecification by computing the power of the most powerful test is proposed. If the power of the most powerful test is low, then the two models will make nearly the same prediction in every application. If the power is high, there will be applications in which the models will make different predictions. This measure, that is, the power of the most powerful test, places various types of model misspecification (item parameter estimation error, multidimensionality, local independence failure, learning and/or fatigue during testing) on a common scale. The theory supporting the method is presented and illustrated with a systematic study of misspecification due to item response function estimation error. In these studies, two joint maximum likelihood estimation methods (LOGIST 2B and LOGIST 5) and two marginal maximum likelihood estimation methods (BILOG and ForScore) were contrasted by measuring the difference between a simulation model and a model obtained by applying an estimation method to simulation data. Marginal estimation was found generally to be superior to joint estimation. The parametric marginal method (BILOG) was superior to the nonparametric method only for three-parameter logistic models. The nonparametric marginal method (ForScore) excelled for more general models. Of the two joint maximum likelihood methods studied, LOGIST 5 appeared to be more accurate than LOGIST 2B.
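
The idea of measuring the distance between two models by the power of the most powerful test can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure or software. It assumes two hypothetical three-parameter logistic models, a standard normal ability distribution approximated on a coarse grid, a test based on a single examinee's response pattern, and a significance level of 0.05. By the Neyman-Pearson lemma the most powerful test rejects for large values of the likelihood ratio, so the sketch simulates that ratio under each model and estimates the power empirically. All names (`irf_3pl`, `power_of_lr_test`, `model_0`, and so on) and all parameter values are illustrative assumptions.

```python
import math
import random

def irf_3pl(theta, a, b, c):
    """Three-parameter logistic item response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Assumed ability distribution: standard normal, discretized on a coarse grid.
GRID = [-3.0 + 0.5 * k for k in range(13)]
_raw = [math.exp(-0.5 * t * t) for t in GRID]
WEIGHTS = [w / sum(_raw) for w in _raw]

def marginal_loglik(pattern, items):
    """Log marginal probability of a 0/1 response pattern under one model."""
    total = 0.0
    for theta, w in zip(GRID, WEIGHTS):
        p = w
        for u, (a, b, c) in zip(pattern, items):
            q = irf_3pl(theta, a, b, c)
            p *= q if u else (1.0 - q)
        total += p
    return math.log(total)

def simulate(items, n, rng):
    """Draw n response patterns from a model, sampling ability from the grid."""
    patterns = []
    for _ in range(n):
        theta = rng.choices(GRID, weights=WEIGHTS)[0]
        patterns.append(
            [1 if rng.random() < irf_3pl(theta, a, b, c) else 0
             for a, b, c in items]
        )
    return patterns

def power_of_lr_test(model_0, model_1, n=1000, alpha=0.05, seed=0):
    """Monte Carlo estimate of the power of the likelihood ratio test
    of model_0 (null) against model_1 (alternative) at level alpha."""
    rng = random.Random(seed)

    def llr(pattern):
        return marginal_loglik(pattern, model_1) - marginal_loglik(pattern, model_0)

    # Critical value: empirical (1 - alpha) quantile of the ratio under the null.
    null = sorted(llr(u) for u in simulate(model_0, n, rng))
    crit = null[math.ceil((1.0 - alpha) * n) - 1]
    # Power: share of patterns from the alternative that exceed the critical value.
    alt = [llr(u) for u in simulate(model_1, n, rng)]
    return sum(x > crit for x in alt) / n

# Hypothetical 20-item models: (discrimination a, difficulty b, guessing c).
model_0 = [(1.0, -2.0 + 0.2 * j, 0.2) for j in range(20)]
model_close = [(a, b + 0.02, c) for a, b, c in model_0]  # tiny perturbation
model_far = [(a, b + 2.0, c) for a, b, c in model_0]     # large shift

if __name__ == "__main__":
    print("power, nearly identical models:", power_of_lr_test(model_0, model_close))
    print("power, clearly different models:", power_of_lr_test(model_0, model_far))
```

Under these assumptions, the estimated power for the nearly identical pair should stay close to the significance level, while the shifted pair should yield a noticeably larger value. That matches the abstract's reading: low power means the models are practically interchangeable, and higher power means some application can tell them apart.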

Original language: English (US)
Pages (from-to): 261-278
Number of pages: 18
Journal: Applied Psychological Measurement
Volume: 16
Issue number: 3
State: Published - Sep 1992

Keywords

  • BILOG
  • ForScore
  • LOGIST
  • estimation
  • forced-choice experiment
  • ideal observer method
  • item response theory
  • models
  • multilinear formula score theory

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Psychology (miscellaneous)
