Abstract
This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds of the certification program. Using many-facet Rasch modeling, we examined rater performance in terms of rater agreement, rater consistency, and rater severity. These measurement estimates of rating quality were subjected to multivariate analysis to examine whether and how rater performance changed across rounds. Rater comments on the essays were qualitatively analyzed to obtain a deeper understanding of how raters learn to use the scale over time. The quantitative results showed a non-linear, three-stage developmental pattern of rater performance for all three groups of raters. These findings suggest that rater development follows a learning curve similar to that observed in the acquisition of a language and other skills. We argue that understanding the developmental pattern of rater behavior is crucial not only to understanding the effectiveness of rater training, but also to the investigation of rater cognition and development. We also discuss the practical implications of this study for the effort and expectations involved in rater training for writing assessments.
Original language | English (US) |
---|---|
Pages (from-to) | 153-179 |
Number of pages | 27 |
Journal | Language Testing |
Volume | 40 |
Issue number | 1 |
State | Published - Jan 2023 |
Keywords
- Longitudinal development
- Many-facets Rasch measurement
- Rater cognition
- Rater reliability
- U-shaped learning curve
ASJC Scopus subject areas
- Language and Linguistics
- Social Sciences (miscellaneous)
- Linguistics and Language