TY - GEN
T1 - Language coverage for mismatched crowdsourcing
AU - Varshney, Lav R.
AU - Jyothi, Preethi
AU - Hasegawa-Johnson, Mark
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/3/27
Y1 - 2017/3/27
AB - Developing automatic speech recognition technologies requires transcribed speech so as to learn the mapping from sound to text. It is traditionally assumed that transcribers need to be native speakers of the language being transcribed. Mismatched crowdsourcing is the transcription of speech by crowd workers who do not speak the language. Given that there are phonological similarities among different human languages, mismatched crowdsourcing does provide noisy data that can be aggregated to yield reliable labels. Here we discuss phonological properties of different languages in a coding-theoretic framework, and how nonnative phoneme misperception can be modeled as a noisy communication channel. We show the results of experiments demonstrating the efficacy of this information-theory-inspired modeling approach, having native English speakers and native Mandarin speakers transcribe Cantonese speech. Finally, we discuss how crowd workers whose native language background gives them the highest probability of faithful transcription can be found by solving a weighted set cover problem.
KW - channel selection
KW - distance distribution
KW - mismatched crowdsourcing
KW - phonology
KW - set cover
KW - speech transcription
UR - http://www.scopus.com/inward/record.url?scp=85013016573&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013016573&partnerID=8YFLogxK
U2 - 10.1109/ITA.2016.7888198
DO - 10.1109/ITA.2016.7888198
M3 - Conference contribution
AN - SCOPUS:85013016573
T3 - 2016 Information Theory and Applications Workshop, ITA 2016
BT - 2016 Information Theory and Applications Workshop, ITA 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 Information Theory and Applications Workshop, ITA 2016
Y2 - 31 January 2016 through 5 February 2016
ER -