TY - GEN
T1 - Robust analysis and weighting on MFCC components for speech recognition and speaker identification
AU - Zhou, Xi
AU - Fu, Yun
AU - Liu, Ming
AU - Hasegawa-Johnson, Mark
AU - Huang, Thomas S.
PY - 2007
Y1 - 2007
N2 - Mismatch between training and testing data is a major source of error in both Automatic Speech Recognition (ASR) and Automatic Speaker Identification (ASI). In this paper, we first present a statistical weighting concept that exploits the unequal sensitivity of Mel-Frequency Cepstral Coefficient (MFCC) components to sources of mismatch such as ambient noise, recording equipment, transmission channels, and inter-speaker variation. Based on this concept, we then design a new Kullback-Leibler (KL) distance-based weighting algorithm for real-world problems in which label information is often unavailable. We evaluate the algorithm on ASR with mismatch caused by different speakers and on ASI with mismatch caused by channel noise. Experimental results demonstrate the effectiveness and robustness of the proposed method.
AB - Mismatch between training and testing data is a major source of error in both Automatic Speech Recognition (ASR) and Automatic Speaker Identification (ASI). In this paper, we first present a statistical weighting concept that exploits the unequal sensitivity of Mel-Frequency Cepstral Coefficient (MFCC) components to sources of mismatch such as ambient noise, recording equipment, transmission channels, and inter-speaker variation. Based on this concept, we then design a new Kullback-Leibler (KL) distance-based weighting algorithm for real-world problems in which label information is often unavailable. We evaluate the algorithm on ASR with mismatch caused by different speakers and on ASI with mismatch caused by channel noise. Experimental results demonstrate the effectiveness and robustness of the proposed method.
UR - http://www.scopus.com/inward/record.url?scp=46449092074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=46449092074&partnerID=8YFLogxK
U2 - 10.1109/icme.2007.4284618
DO - 10.1109/icme.2007.4284618
M3 - Conference contribution
AN - SCOPUS:46449092074
SN - 1424410177
SN - 9781424410170
T3 - Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
SP - 188
EP - 191
BT - Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
PB - IEEE Computer Society
T2 - IEEE International Conference on Multimedia and Expo, ICME 2007
Y2 - 2 July 2007 through 5 July 2007
ER -