TY - GEN
T1 - Recognizing zero-resourced languages based on mismatched machine transcriptions
AU - Chen, Wenda
AU - Hasegawa-Johnson, Mark
AU - Chen, Nancy F.
N1 - Funding Information:
This work is funded by the Agency for Science, Technology and Research (A*STAR) Graduate Scholarship.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/9/10
Y1 - 2018/9/10
AB - Probabilistic human transcription based on mismatched crowdsourcing has been proposed recently for training and adapting acoustic models for zero-resourced languages, where we do not have any native transcriptions. This paper describes a machine-transcription-based phone recognition system for recognizing zero-resourced languages and compares it with baseline systems of MAP adaptation and semi-supervised self-training. With a set of available speech recognizers in source languages that cover all the basic phonetic features, this work shows that we can use mismatched machine transcriptions from these source languages to achieve human-level transcriptions, bypassing the laborious effort of obtaining human transcriptions. We also present a fully automated unsupervised approach for zero-resourced speech recognition using mismatched machine transcriptions for transfer learning of phone models.
KW - Automatic speech recognition (ASR)
KW - Mismatched machine transcription
KW - Modular system
KW - Transfer learning
KW - Zero-resourced languages
UR - http://www.scopus.com/inward/record.url?scp=85054220595&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054220595&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2018.8462481
DO - 10.1109/ICASSP.2018.8462481
M3 - Conference contribution
AN - SCOPUS:85054220595
SN - 9781538646588
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 5979
EP - 5983
BT - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Y2 - 15 April 2018 through 20 April 2018
ER -