Mismatched Crowdsourcing based Language Perception for Under-resourced Languages

Wenda Chen, Mark Allan Hasegawa-Johnson, Nancy F. Chen

Research output: Contribution to journal › Conference article

Abstract

Mismatched crowdsourcing is a technique for acquiring training data for automatic speech recognizers in under-resourced languages: workers who do not know the target language transcribe it, and their transcriptions are decoded using a noisy-channel model of cross-language speech perception. All previous mismatched crowdsourcing studies have used English transcribers; this study is the first to recruit transcribers with a different native language, in this case Mandarin Chinese. Using these data, we compute statistical models of the transcribers' cross-language perception of tones and phonemes, based on phone distinctive features and tone features. By analyzing the phonetic and tonal variation mappings and their coverage relative to the dictionary of the target language, we evaluate the effect of the transcribers' native language on their performance.
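The noisy-channel decoding idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the phone inventory, the prior `PRIOR`, the channel table `CHANNEL`, and the function `decode` are all hypothetical, and the probabilities are toy values chosen only for illustration. The sketch assumes that each transcriber's symbol is produced independently given the target phone, and it would raise a `KeyError` for any symbol missing from the toy table.

```python
import math

# Hypothetical toy values for illustration only; none come from the paper.
# Prior over target-language phones, P(phone).
PRIOR = {"p": 0.5, "b": 0.5}

# Channel model P(symbol | phone): how a transcriber who does not know
# the target language might write down each target phone.
CHANNEL = {
    "p": {"p": 0.7, "b": 0.3},
    "b": {"p": 0.4, "b": 0.6},
}


def decode(symbols):
    """Return the posterior P(phone | symbols) for a list of transcriber symbols.

    Assumes the symbols are conditionally independent given the target phone.
    """
    log_scores = {}
    for phone, prior in PRIOR.items():
        score = math.log(prior)
        for symbol in symbols:
            score += math.log(CHANNEL[phone][symbol])
        log_scores[phone] = score

    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {ph: math.exp(s - top) / total for ph, s in log_scores.items()}


# Three transcribers wrote "p", "b", "p" for the same speech segment.
posterior = decode(["p", "b", "p"])
best = max(posterior, key=posterior.get)
```

With these toy numbers the decoder prefers the target phone "p", because two of the three transcriptions agree with it and the assumed channel favors faithful transcription of "p".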

Original language: English (US)
Pages (from-to): 23-29
Number of pages: 7
Journal: Procedia Computer Science
Volume: 81
DOI: 10.1016/j.procs.2016.04.025
State: Published - Jan 1 2016
Event: 5th Workshop on Spoken Language Technologies for Under-resourced Languages, SLTU 2016 - Yogyakarta, Indonesia
Duration: May 9 2016 to May 12 2016

Fingerprint

Speech analysis
Transcription
Glossaries
Decoding
Statistical Models

Keywords

  • Low Resource Language
  • Mismatched Crowdsourcing
  • Speech Perception
  • Speech Recognition

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Mismatched Crowdsourcing based Language Perception for Under-resourced Languages. / Chen, Wenda; Hasegawa-Johnson, Mark Allan; Chen, Nancy F.

In: Procedia Computer Science, Vol. 81, 01.01.2016, p. 23-29.

@article{4f169ff099054d9cb39a0049b7f37f7b,
title = "Mismatched Crowdsourcing based Language Perception for Under-resourced Languages",
abstract = "Mismatched crowdsourcing is a technique for acquiring automatic speech recognizer training data in under-resourced languages by decoding the transcriptions of workers who don't know the target language using a noisy-channel model of cross-language speech perception. All previous mismatched crowdsourcing studies have used English transcribers; this study is the first to recruit transcribers with a different native language, in this case, Mandarin Chinese. Using these data we are able to compute statistical models of cross-language perception of the tones and phonemes from transcribers based on phone distinctive features and tone features. By analyzing the phonetic and tonal variation mappings and coverages compared with the dictionary of the target language, we evaluate the different native languages' effect on the transcribers' performances.",
keywords = "Low Resource Language, Mismatched Crowdsourcing, Speech Perception, Speech Recognition",
author = "Chen, Wenda and Hasegawa-Johnson, {Mark Allan} and Chen, {Nancy F.}",
year = "2016",
month = "1",
day = "1",
doi = "10.1016/j.procs.2016.04.025",
language = "English (US)",
volume = "81",
pages = "23--29",
journal = "Procedia Computer Science",
issn = "1877-0509",
publisher = "Elsevier BV",
}

TY - JOUR
T1 - Mismatched Crowdsourcing based Language Perception for Under-resourced Languages
AU - Chen, Wenda
AU - Hasegawa-Johnson, Mark Allan
AU - Chen, Nancy F.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Mismatched crowdsourcing is a technique for acquiring automatic speech recognizer training data in under-resourced languages by decoding the transcriptions of workers who don't know the target language using a noisy-channel model of cross-language speech perception. All previous mismatched crowdsourcing studies have used English transcribers; this study is the first to recruit transcribers with a different native language, in this case, Mandarin Chinese. Using these data we are able to compute statistical models of cross-language perception of the tones and phonemes from transcribers based on phone distinctive features and tone features. By analyzing the phonetic and tonal variation mappings and coverages compared with the dictionary of the target language, we evaluate the different native languages' effect on the transcribers' performances.

KW - Low Resource Language
KW - Mismatched Crowdsourcing
KW - Speech Perception
KW - Speech Recognition
UR - http://www.scopus.com/inward/record.url?scp=84976447899&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84976447899&partnerID=8YFLogxK
U2 - 10.1016/j.procs.2016.04.025
DO - 10.1016/j.procs.2016.04.025
M3 - Conference article
AN - SCOPUS:84976447899
VL - 81
SP - 23
EP - 29
JO - Procedia Computer Science
JF - Procedia Computer Science
SN - 1877-0509
ER -