Choice of Test Model for Appropriateness Measurement

Research output: Contribution to journal › Article

Abstract

Several theoretical and empirical issues that must be addressed before appropriateness measurement can be used by practitioners are investigated in this paper. These issues include selection of a latent trait model for multiple-choice tests, selection of a particular appropriateness index, and the sample size required for parameter estimation. The three-parameter logistic model is found to provide better detection of simulated spuriously low examinees than the Rasch model for the Graduate Record Examination, Verbal Section. All three appropriateness indices proposed by Levine and Rubin (1979) provide good detection of simulated spuriously low examinees but poor detection of simulated spuriously high examinees. A reason for this discrepancy is provided.

Original language: English (US)
Pages (from-to): 297-308
Number of pages: 12
Journal: Applied Psychological Measurement
Volume: 6
Issue number: 3
DOI: 10.1177/014662168200600307
State: Published - Jan 1 1982

Fingerprint

Educational Measurement
Sample Size
Logistic Models
logistics
graduate
examination

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Psychology (miscellaneous)

Cite this

Choice of Test Model for Appropriateness Measurement. / Drasgow, Fritz.

In: Applied Psychological Measurement, Vol. 6, No. 3, 01.01.1982, p. 297-308.

@article{48d802ba2d3e4e8ba4980bb34c7e1086,
title = "Choice of Test Model for Appropriateness Measurement",
abstract = "Several theoretical and empirical issues that must be addressed before appropriateness measurement can be used by practitioners are investigated in this paper. These issues include selection of a latent trait model for multiple-choice tests, selection of a particular appropriateness index, and the sample size required for parameter estimation. The three-parameter logistic model is found to provide better detection of simulated spuriously low examinees than the Rasch model for the Graduate Record Examination, Verbal Section. All three appropriateness indices proposed by Levine and Rubin (1979) provide good detection of simulated spuriously low examinees but poor detection of simulated spuriously high examinees. A reason for this discrepancy is provided.",
author = "Fritz Drasgow",
year = "1982",
month = "1",
day = "1",
doi = "10.1177/014662168200600307",
language = "English (US)",
volume = "6",
pages = "297--308",
journal = "Applied Psychological Measurement",
issn = "0146-6216",
publisher = "SAGE Publications Inc.",
number = "3",
}

TY  - JOUR
T1  - Choice of Test Model for Appropriateness Measurement
AU  - Drasgow, Fritz
PY  - 1982/1/1
Y1  - 1982/1/1
N2  - Several theoretical and empirical issues that must be addressed before appropriateness measurement can be used by practitioners are investigated in this paper. These issues include selection of a latent trait model for multiple-choice tests, selection of a particular appropriateness index, and the sample size required for parameter estimation. The three-parameter logistic model is found to provide better detection of simulated spuriously low examinees than the Rasch model for the Graduate Record Examination, Verbal Section. All three appropriateness indices proposed by Levine and Rubin (1979) provide good detection of simulated spuriously low examinees but poor detection of simulated spuriously high examinees. A reason for this discrepancy is provided.
AB  - Several theoretical and empirical issues that must be addressed before appropriateness measurement can be used by practitioners are investigated in this paper. These issues include selection of a latent trait model for multiple-choice tests, selection of a particular appropriateness index, and the sample size required for parameter estimation. The three-parameter logistic model is found to provide better detection of simulated spuriously low examinees than the Rasch model for the Graduate Record Examination, Verbal Section. All three appropriateness indices proposed by Levine and Rubin (1979) provide good detection of simulated spuriously low examinees but poor detection of simulated spuriously high examinees. A reason for this discrepancy is provided.
UR  - http://www.scopus.com/inward/record.url?scp=84970306258&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84970306258&partnerID=8YFLogxK
U2  - 10.1177/014662168200600307
DO  - 10.1177/014662168200600307
M3  - Article
VL  - 6
SP  - 297
EP  - 308
JO  - Applied Psychological Measurement
JF  - Applied Psychological Measurement
SN  - 0146-6216
IS  - 3
ER  - 