Connecting Corpus Linguistics and Assessment

Geoffrey LaFlair, Shelley Staples, Xun Yan

Research output: Chapter in Book/Report/Conference proceeding › Chapter


There is growing interest both in the use of corpus linguistic methods in language assessment research and in the use of assessment in corpus linguistic research. To explore the triangulation of methods in these two research areas, this chapter has three main goals: (i) to survey the ways in which corpus linguistics and assessment research can complement each other; (ii) to present a case study in which corpus linguistics is used to draw inferences that support three stages of assessment validation: scoring, generalization, and extrapolation; and (iii) to reflect on the benefits and limitations of triangulating corpus linguistics and assessment methods.

For the case study, we compiled a corpus of responses to the independent writing task on Michigan Language Assessment’s Examination for the Certificate of Proficiency in English (ECPE). We used a popular corpus-linguistic method, multi-dimensional (MD) analysis, to compare representative linguistic features of writing performances across proficiency levels and prompts. In addition, we compared the language of the responses to a reference corpus composed of first-year writing samples from composition courses. The case study demonstrates that triangulating register-based corpus linguistics and assessment research has the potential to inform both areas.

For language assessment, inferences at different validation stages can be drawn empirically from corpus studies. For scoring inferences, corpus analysis can help identify co-occurrence patterns of linguistic features that characterize different levels of performance. For generalization inferences, corpus analysis enables comparison of prompts across test forms to examine whether different prompts elicit similar linguistic features at the same proficiency level. For extrapolation inferences, register-based language analyses can investigate relationships between linguistic features in test performances and linguistic features in the target domain.

For corpus linguistics, there are two potential benefits. First, there is potential to make positive changes in how reference corpora are specified and created. If we want reference corpora to represent domains of language use that test takers will be participating in during their studies or professional lives, there is a greater need to ensure balance and representativeness in corpus design. Second, assessment methods allow corpus linguists to understand the language used by speakers and writers more fully, particularly with regard to language development but also in terms of the (perceived) real-world impact of language use.
Original language: English (US)
Title of host publication: Using Corpus Methods to Triangulate Linguistic Analysis
Editors: Jesse Egbert, Paul Baker
ISBN (Electronic): 9781315112466
ISBN (Print): 9780367777050, 9781138082540
State: Published - Sep 19 2019

Publication series

Name: Routledge Advances in Corpus Linguistics

