The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Research output: Contribution to journal › Article › peer-review


A classic topic in psychometrics and measurement is the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and precision of scores within the CTT and IRT frameworks. New results are presented on the relative precision of CTT and IRT scores (i.e., the test score conditional standard error of measurement for a given trait value), and these results shed light on the conditions under which total scores and IRT estimates are measured more or less precisely. The relative reliability of CTT and IRT scores is examined as a function of item characteristics (e.g., locations, category thresholds, and discriminations) and subject characteristics (e.g., the skewness and kurtosis of the latent distribution). CTT total scores were more reliable when the latent distribution was mismatched with category thresholds, but the discrepancy between CTT and IRT declined as the number of scale categories increased. The article also considers the appropriateness of linear approximations of polytomous items and presents circumstances in which linear approximations are viable. A linear approximation may be appropriate for items with two response options, depending on the item discrimination and the match between the item location and the latent distribution. However, linear approximations are biased whenever items are located in the tails of the latent distribution, and the bias is larger for more discriminating items.
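To make the notion of conditional precision concrete, the following is a minimal sketch and not the article's own equations. It assumes a graded response model with logistic category boundaries, and the item parameters in `ITEMS` are hypothetical values chosen only for illustration. For a given trait value it computes the conditional standard error of measurement of the total score, the same quantity rescaled to the trait metric by the delta method, and the IRT standard error implied by the test information function.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_item(theta, a, thresholds):
    """Conditional variance, slope, and information of one graded-response item.

    The item is scored 0..K-1, where K = len(thresholds) + 1.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    probs = [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
    # w[k] = P*(1 - P*) is zero at both padded ends.
    w = [c * (1.0 - c) for c in cum]
    slope = a * sum(w[1:-1])  # derivative of the expected item score
    info = a * a * sum((w[k] - w[k + 1]) ** 2 / probs[k]
                       for k in range(len(probs)))
    return var, slope, info


def conditional_se(theta, items):
    """Return (total-score SEM, total-score SEM in the trait metric, IRT SE)."""
    parts = [grm_item(theta, a, bs) for a, bs in items]
    var = sum(p[0] for p in parts)
    slope = sum(p[1] for p in parts)
    info = sum(p[2] for p in parts)
    sem_raw = math.sqrt(var)
    sem_theta = sem_raw / slope  # delta-method rescaling
    irt_se = 1.0 / math.sqrt(info)
    return sem_raw, sem_theta, irt_se


# Hypothetical items: (discrimination, ordered category thresholds).
ITEMS = [
    (1.0, [-1.0, 0.0, 1.0]),
    (1.5, [-0.5, 0.5, 1.5]),
    (2.0, [0.0, 1.0, 2.0]),
]

if __name__ == "__main__":
    for theta in (-2.0, 0.0, 2.0):
        raw, ctt, irt = conditional_se(theta, ITEMS)
        print(f"theta={theta:+.1f}  SEM(total)={raw:.3f}  "
              f"SEM(theta metric)={ctt:.3f}  SE(IRT)={irt:.3f}")
```

Under these assumptions, the rescaled total-score SEM can never fall below the information-based standard error, and the two coincide for two-category items that share a common discrimination. How the gap behaves for other parameter configurations and latent distributions is the kind of question the article addresses.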

Original language: English (US)
Pages (from-to): 201-225
Number of pages: 25
Journal: Applied Psychological Measurement
Issue number: 3
State: Published - May 1 2013


Keywords

  • classical test theory
  • information function
  • item response theory
  • polytomous items
  • reliability
  • scale construction

ASJC Scopus subject areas

  • Psychology (miscellaneous)
  • Social Sciences (miscellaneous)
