Landmark-based pronunciation error identification on Chinese learning

Xuesong Yang, Xiang Kong, Mark Hasegawa-Johnson, Yanlu Xie

Research output: Contribution to journal › Conference article › peer-review


This paper explores a novel approach to identifying pronunciation errors made by second language (L2) learners, based on the landmark theory of human speech perception. Earlier work on distinctive-feature selection and on the likelihood-based “goodness of pronunciation” (GOP) measure has made progress for several L2 languages, e.g. Dutch and English. However, performance gains have been limited by error-prone automatic speech recognition (ASR) systems and by features that discriminate poorly. Landmark theory posits the existence of quantal nonlinearities in the articulatory-acoustic relationship and provides a basis for selecting landmark positions that are suitable for identifying pronunciation errors. Leveraging this English acoustic landmark theory, we propose to select salient Mandarin Chinese phonetic landmarks for the 16 phonemes most frequently mispronounced by Japanese (L1) learners, and to extract features at those landmarks, including mel-frequency cepstral coefficients (MFCC) and formants. Both cross-validation and evaluation are performed for individual phonemes using a support vector machine with a linear kernel. Experiments show that our landmark-based approaches achieve a significantly higher micro-average F1 score than GOP-based methods.
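As an illustration of the metric named in the abstract, the sketch below shows how a micro-average F1 score is generally computed when each phoneme has its own binary classifier: true positives, false positives, and false negatives are pooled over all phonemes before a single F1 is taken. It is a minimal sketch under assumptions, not the authors' evaluation code. The function names, the toy labels, the phoneme keys `zh` and `r`, and the choice of label 1 for "mispronounced" are all hypothetical.

```python
from typing import Dict, List, Tuple


def count_errors(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int]:
    """Return (tp, fp, fn), treating label 1 as the positive class.

    Assumption: 1 = mispronounced, 0 = correctly pronounced.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def micro_f1(per_phoneme: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Pool counts over all phonemes, then compute one F1 score."""
    tp = fp = fn = 0
    for y_true, y_pred in per_phoneme.values():
        a, b, c = count_errors(y_true, y_pred)
        tp, fp, fn = tp + a, fp + b, fn + c
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: (reference labels, classifier decisions) per phoneme.
toy = {
    "zh": ([1, 0, 1, 1], [1, 0, 0, 1]),  # tp=2, fp=0, fn=1
    "r": ([0, 1, 0, 0], [1, 1, 0, 0]),   # tp=1, fp=1, fn=0
}

print(micro_f1(toy))  # 0.75, since 2*3 / (2*3 + 1 + 1) = 6/8
```

Because the counts are pooled, phonemes with more test tokens weigh more heavily in the result than they would under a macro average, which averages per-phoneme F1 scores equally.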

Original language: English (US)
Pages (from-to): 247-251
Number of pages: 5
Journal: Proceedings of the International Conference on Speech Prosody
State: Published - 2016
Event: 8th Speech Prosody 2016 - Boston, United States
Duration: May 31 2016 - Jun 3 2016


  • Acoustic Landmarks
  • Distinctive Features
  • Pronunciation Error Identification
  • Second Language Acquisition

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

