TY - GEN
T1 - Development of a TV broadcasts speech recognition system for Qatari Arabic
AU - Elmahdy, Mohamed
AU - Hasegawa-Johnson, Mark
AU - Mustafawi, Eiman
N1 - Funding Information:
This publication was made possible by a grant from the Qatar National Research Fund under its National Priorities Research Program (NPRP), award number NPRP 09-410-1-069. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Qatar National Research Fund. We would also like to acknowledge the European Language Resources Association (ELRA) and the Linguistic Data Consortium (LDC) for providing us with data resources.
PY - 2014
Y1 - 2014
AB - A major problem in dialectal Arabic speech recognition is the sparsity of speech resources. In this paper, a transfer learning framework is proposed that jointly uses a large amount of Modern Standard Arabic (MSA) data and a small amount of dialectal Arabic data to improve acoustic and language modeling. The Qatari Arabic (QA) dialect was chosen as a typical example of an under-resourced Arabic dialect. A wide-band speech corpus was collected and transcribed from several Qatari TV series and talk-show programs. A large-vocabulary speech recognition baseline system was built using the QA corpus. The proposed MSA-based transfer learning technique was implemented by applying orthographic normalization, phone mapping, data pooling, acoustic model adaptation, and system combination. The proposed approach can achieve a relative reduction in word error rate (WER) of more than 28%.
KW - Dialectal Arabic
KW - Speech recognition
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=84942813989&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84942813989&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84942813989
T3 - Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
SP - 3057
EP - 3061
BT - Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
A2 - Calzolari, Nicoletta
A2 - Choukri, Khalid
A2 - Goggi, Sara
A2 - Declerck, Thierry
A2 - Mariani, Joseph
A2 - Maegaard, Bente
A2 - Moreno, Asuncion
A2 - Odijk, Jan
A2 - Mazo, Helene
A2 - Piperidis, Stelios
A2 - Loftsson, Hrafn
PB - European Language Resources Association (ELRA)
T2 - 9th International Conference on Language Resources and Evaluation, LREC 2014
Y2 - 26 May 2014 through 31 May 2014
ER -