Triangulating natural language processing (NLP)-based analysis of rater comments and many-facet Rasch measurement (MFRM): An innovative approach to investigating raters’ application of rating scales in writing assessment

Huiying Cai, Xun Yan

Research output: Contribution to journal (Article, peer-reviewed)

Abstract

Rater comments tend to be qualitatively analyzed to indicate raters’ application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The data consisted of ratings on 987 essays by 36 raters (a total of 3948 analytic scores and 1974 rater comments) on a post-admission English Placement Test (EPT) at a large US university. We computed a set of comment-based features based on the analytic components and evaluative language the raters used to infer whether raters were aligned to the scale. For data triangulation, we performed correlation analyses between the MFRM measures of rater performance and the comment-based measures. Although the EPT raters showed overall satisfactory performance, we found meaningful associations between rater comments and performance features. In particular, raters with higher precision and fit to what the Rasch model predicts used more analytic components and used evaluative language more similar to the scale descriptors. These findings suggest that NLP techniques have the potential to help language testers analyze rater comments and understand rater behavior.
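The triangulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the component keyword lists, the descriptor text, the bag-of-words cosine similarity, and all data values below are hypothetical stand-ins, chosen only to show how comment-based features might be computed per rater and then correlated with an MFRM-derived rater statistic.

```python
import math
import re
from collections import Counter

# Hypothetical analytic components and keywords (assumption: the actual
# EPT rubric and the authors' feature definitions are not given here).
COMPONENTS = {
    "argument": {"argument", "thesis", "claim"},
    "organization": {"organization", "structure", "paragraph"},
    "grammar": {"grammar", "syntax", "tense"},
    "vocabulary": {"vocabulary", "word", "lexical"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def component_coverage(comment):
    """Count how many analytic components a comment mentions."""
    tokens = set(tokenize(comment))
    return sum(1 for keywords in COMPONENTS.values() if tokens & keywords)


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two texts."""
    ca, cb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else float("nan")


if __name__ == "__main__":
    # Hypothetical scale descriptor and rater comments, for illustration only.
    descriptor = "clear thesis with well developed argument and accurate grammar"
    comments = {
        "rater_a": "Clear thesis and developed argument, accurate grammar overall.",
        "rater_b": "Nice essay, I liked it.",
        "rater_c": "Paragraph structure is weak but vocabulary is varied.",
    }
    # Hypothetical MFRM rater statistics (made-up numbers, not study results).
    mfrm_stat = {"rater_a": 0.98, "rater_b": 1.45, "rater_c": 1.10}

    raters = sorted(comments)
    coverage = [component_coverage(comments[r]) for r in raters]
    similarity = [cosine_similarity(comments[r], descriptor) for r in raters]
    stats = [mfrm_stat[r] for r in raters]

    print("coverage vs. MFRM statistic:", pearson(coverage, stats))
    print("similarity vs. MFRM statistic:", pearson(similarity, stats))
```

In practice the MFRM measures would come from dedicated Rasch software, and the similarity measure could be replaced by any text-similarity method; the sketch only shows the shape of the comparison.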

Original language: English (US)
Pages (from-to): 384-411
Number of pages: 28
Journal: Language Testing
Volume: 41
Issue number: 2
State: Published - Apr 2024

Keywords

  • Many-facet Rasch measurement
  • natural language processing
  • rater comments
  • rater performance
  • rating scale

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)
  • Linguistics and Language
