The New Standards Project conducted a pilot test of a series of performance-based assessment tasks in mathematics and English language arts at Grades 4 and 8 in the spring of 1993. This article reports the results of a series of generalizability analyses conducted for a subset of the 1993 pilot study data in mathematics. Generalizability analyses for completely crossed designs of Raters × Tasks × Pupils were conducted for a total of nine collections of mathematics tasks. The results of those analyses were used to estimate standard errors of measurement for absolute decision studies using various combinations of number of raters and number of tasks. Consistent with results of previous analyses of performance-based assessment tasks, sampling variability due to tasks was found to be substantially larger than that due to raters. Implications for assessment designs are discussed.
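As a rough illustration of how such standard errors depend on the number of tasks and raters, the sketch below applies the standard generalizability-theory formula for absolute error variance in a fully crossed Pupils × Tasks × Raters design. The function name `absolute_sem` and every variance component value are hypothetical placeholders chosen for illustration; they are not estimates from the 1993 pilot study.

```python
import math


def absolute_sem(var, n_tasks, n_raters):
    """Standard error of measurement for absolute decisions in a crossed
    p x t x r design, given variance components and D-study sample sizes.

    Every component except the pupil (universe-score) variance contributes
    to absolute error, each divided by the number of conditions averaged over.
    """
    error_var = (
        var["t"] / n_tasks
        + var["r"] / n_raters
        + var["pt"] / n_tasks
        + var["pr"] / n_raters
        + var["tr"] / (n_tasks * n_raters)
        + var["ptr,e"] / (n_tasks * n_raters)
    )
    return math.sqrt(error_var)


# Hypothetical variance components (NOT from the study), chosen so that
# task-related components are much larger than rater-related ones.
components = {
    "p": 0.30,      # pupils (universe-score variance, not part of error)
    "t": 0.10,      # tasks
    "r": 0.01,      # raters
    "pt": 0.40,     # pupil x task interaction
    "pr": 0.01,     # pupil x rater interaction
    "tr": 0.01,     # task x rater interaction
    "ptr,e": 0.20,  # three-way interaction confounded with residual error
}

# Tabulate the SEM for several combinations of tasks and raters.
for n_tasks in (1, 2, 5, 10):
    for n_raters in (1, 2):
        sem = absolute_sem(components, n_tasks, n_raters)
        print(f"tasks={n_tasks:2d} raters={n_raters} SEM={sem:.3f}")
```

With components of this shape, adding a second task lowers the standard error more than adding a second rater, which is the pattern that follows whenever task-related variance dominates rater-related variance.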