A comparison of grammatical proficiency measures in the automated assessment of spontaneous speech

Su Youn Yoon, Suma Pallathadka Bhat

Research output: Contribution to journal › Article

Abstract

We developed new measures that assess the level of grammatical proficiency for an automated speech proficiency scoring system. The new measures assess the range and sophistication of grammar usage using natural language processing technology and a large corpus of learners’ spoken responses. First, we automatically identified a set of grammatical expressions associated with each proficiency level in the corpus. Next, we predicted the level of grammatical proficiency from the similarity between the distribution of grammatical expressions in a learner's response and the distributions in the corpus. We evaluated the strength of the association between the new measures and proficiency levels using spontaneous responses from an international English language assessment. The Pearson correlation results showed that, compared to commonly used syntactic complexity measures, the proposed measures had stronger relationships with proficiency. We also explored the impact of system errors from the multi-stage automated process and found that the new measures were robust against these errors. Finally, we developed an automated scoring model that predicted holistic oral proficiency scores. The new measures led to a statistically significant improvement in agreement between human and machine scores over the previous system.
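
The abstract describes the core computation only at a high level. As a minimal illustration (not the authors' implementation), the sketch below builds a distribution of grammatical expressions for each proficiency level from a tagged learner corpus and scores a new response by its similarity to each level profile. The use of POS-tag bigrams as a proxy for grammatical expressions and cosine similarity as the metric are assumptions made here for illustration; the paper's actual feature inventory and similarity measure may differ.

# Minimal sketch, assuming POS-bigram features and cosine similarity
# (both illustrative choices, not taken from the paper).

from collections import Counter
import math

def pos_bigrams(tags):
    # POS-tag bigrams stand in for "grammatical expressions"
    return [f"{a}_{b}" for a, b in zip(tags, tags[1:])]

def normalize(counts):
    # Convert raw counts to a relative-frequency distribution
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}

def build_level_profiles(corpus):
    # corpus: proficiency level -> list of POS-tag sequences from learner responses
    profiles = {}
    for level, responses in corpus.items():
        counts = Counter()
        for tags in responses:
            counts.update(pos_bigrams(tags))
        profiles[level] = normalize(counts)
    return profiles

def cosine(p, q):
    # Cosine similarity between two sparse distributions
    num = sum(p[k] * q.get(k, 0.0) for k in p)
    denom = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return num / denom if denom else 0.0

def level_similarities(response_tags, profiles):
    # Similarity of one response's grammatical-expression distribution to each level profile
    dist = normalize(Counter(pos_bigrams(response_tags)))
    return {level: cosine(dist, profile) for level, profile in profiles.items()}

# Toy usage with hypothetical POS sequences:
corpus = {
    "low": [["PRP", "VBP", "NN"], ["PRP", "VBP", "JJ", "NN"]],
    "high": [["PRP", "MD", "VB", "VBN", "IN", "DT", "NN"]],
}
profiles = build_level_profiles(corpus)
print(level_similarities(["PRP", "VBP", "DT", "NN"], profiles))

In the setup described by the abstract, per-level similarity scores of this kind would then serve as features in the automated scoring model alongside the existing syntactic complexity measures.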

Original language: English (US)
Pages (from-to): 221-230
Number of pages: 10
Journal: Speech Communication
Volume: 99
DOI: 10.1016/j.specom.2018.04.003
State: Published - May 2018

Keywords

  • Automated scoring
  • Grammatical development
  • Natural language processing
  • Similarity measures
  • Syntactic complexity measures

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

A comparison of grammatical proficiency measures in the automated assessment of spontaneous speech. / Yoon, Su Youn; Bhat, Suma Pallathadka.

In: Speech Communication, Vol. 99, 05.2018, p. 221-230.

Research output: Contribution to journal › Article

@article{74a0193dfee84cb69cbb72834dc76d0e,
title = "A comparison of grammatical proficiency measures in the automated assessment of spontaneous speech",
abstract = "We developed new measures that assess the level of grammatical proficiency for an automated speech proficiency scoring system. The new measures assess the range and sophistication in grammar usage based on natural language processing technology and a large corpus of learners’ spoken responses. First, we automatically identified a set of grammatical expressions associated with each proficiency level from the corpus. Next, we predicted the level of grammatical proficiency based on the similarity in the grammatical expression distribution between a learner's response and the corpus. We evaluated the strength of the association between the new measures and proficiency levels using spontaneous responses from an international English language assessment. The Pearson correlation test results showed that compared to commonly used syntactic complexity measures the proposed measures had stronger relationships with proficiency. We also explored the impact of system errors from a multi-stage automated process and found that the new measures were robust against the errors. Finally, we developed an automated scoring model which predicted the holistic oral proficiency scores. The new measures led to statistically significant improvement in agreement between human and machine scores over the previous system.",
keywords = "Automated scoring, Grammatical development, Natural language processing, Similarity measures, Syntactic complexity measures",
author = "Yoon, {Su Youn} and Bhat, {Suma Pallathadka}",
year = "2018",
month = "5",
doi = "10.1016/j.specom.2018.04.003",
language = "English (US)",
volume = "99",
pages = "221--230",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",

}

TY - JOUR

T1 - A comparison of grammatical proficiency measures in the automated assessment of spontaneous speech

AU - Yoon, Su Youn

AU - Bhat, Suma Pallathadka

PY - 2018/5

Y1 - 2018/5

N2 - We developed new measures that assess the level of grammatical proficiency for an automated speech proficiency scoring system. The new measures assess the range and sophistication in grammar usage based on natural language processing technology and a large corpus of learners’ spoken responses. First, we automatically identified a set of grammatical expressions associated with each proficiency level from the corpus. Next, we predicted the level of grammatical proficiency based on the similarity in the grammatical expression distribution between a learner's response and the corpus. We evaluated the strength of the association between the new measures and proficiency levels using spontaneous responses from an international English language assessment. The Pearson correlation test results showed that compared to commonly used syntactic complexity measures the proposed measures had stronger relationships with proficiency. We also explored the impact of system errors from a multi-stage automated process and found that the new measures were robust against the errors. Finally, we developed an automated scoring model which predicted the holistic oral proficiency scores. The new measures led to statistically significant improvement in agreement between human and machine scores over the previous system.

AB - We developed new measures that assess the level of grammatical proficiency for an automated speech proficiency scoring system. The new measures assess the range and sophistication in grammar usage based on natural language processing technology and a large corpus of learners’ spoken responses. First, we automatically identified a set of grammatical expressions associated with each proficiency level from the corpus. Next, we predicted the level of grammatical proficiency based on the similarity in the grammatical expression distribution between a learner's response and the corpus. We evaluated the strength of the association between the new measures and proficiency levels using spontaneous responses from an international English language assessment. The Pearson correlation test results showed that compared to commonly used syntactic complexity measures the proposed measures had stronger relationships with proficiency. We also explored the impact of system errors from a multi-stage automated process and found that the new measures were robust against the errors. Finally, we developed an automated scoring model which predicted the holistic oral proficiency scores. The new measures led to statistically significant improvement in agreement between human and machine scores over the previous system.

KW - Automated scoring

KW - Grammatical development

KW - Natural language processing

KW - Similarity measures

KW - Syntactic complexity measures

UR - http://www.scopus.com/inward/record.url?scp=85045549515&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045549515&partnerID=8YFLogxK

U2 - 10.1016/j.specom.2018.04.003

DO - 10.1016/j.specom.2018.04.003

M3 - Article

AN - SCOPUS:85045549515

VL - 99

SP - 221

EP - 230

JO - Speech Communication

JF - Speech Communication

SN - 0167-6393

ER -