TY - GEN
T1 - Low-resource spoken keyword search strategies in Georgian inspired by distinctive feature theory
AU - Chen, Nancy F.
AU - Lim, Boon Pang
AU - Do, Van Hai
AU - Pham, Van Tung
AU - Ni, Chongjia
AU - Xu, Haihua
AU - Hasegawa-Johnson, Mark
AU - Chen, Wenda
AU - Xiao, Xiong
AU - Sivadas, Sunil
AU - Chng, Eng Siong
AU - Ma, Bin
AU - Li, Haizhou
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - We present low-resource spoken keyword search (KWS) strategies guided by distinctive feature theory in linguistics to conduct data selection, feature selection, and transcription augmentation. These strategies were employed in the context of the 2016 NIST Open Keyword Search Evaluation (OpenKWS16) using conversational Georgian from the IARPA Babel program. In particular, we elaborate on the following: (1) We exploit glottal-source-related acoustic features that characterize Georgian ejective phonemes ([+constricted glottis] and [+raised larynx ejective], as specified in distinctive feature theory). These features complement standard acoustic features, leading to a relative fusion gain of 11.9%. (2) We use noisy channel models to incorporate probabilistic phonetic transcriptions from mismatched crowdsourcing, enabling transfer learning that improves KWS under extremely under-resourced conditions (24 min of transcribed Georgian) and achieves a relative improvement of 118% over the baseline and a relative fusion gain of 32%. (3) Using distinctive feature analysis, we select a compact subset of source languages used in past evaluations to ensure high phonetic coverage for cross-lingual acoustic modeling when only limited system development time and computational resources are available. This strategy achieves performance comparable to using all available linguistic resources while selecting only 1/3 of the source languages.
UR - http://www.scopus.com/inward/record.url?scp=85050497715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050497715&partnerID=8YFLogxK
U2 - 10.1109/APSIPA.2017.8282237
DO - 10.1109/APSIPA.2017.8282237
M3 - Conference contribution
AN - SCOPUS:85050497715
T3 - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
SP - 1322
EP - 1327
BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Y2 - 12 December 2017 through 15 December 2017
ER -