Abstract
This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length of feature vector for clean speech and at 10 dB compared to FFT-based Mel-frequency cepstrum coefficients (MFCC) by using acoustic features selected based on a maximum mutual information criterion. Using 16 different feature sets, the rank of the feature sets based on mutual information can predict phoneme recognition accuracy with a correlation coefficient of 0.71 compared to a correlation coefficient of 0.28 when using a criterion based on the average pair-wise Kullback-Liebler divergence to rank the feature sets.
Original language | English (US) |
---|---|
Pages | 2129-2132 |
Number of pages | 4 |
State | Published - 2002 |
Event | 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States Duration: Sep 16 2002 → Sep 20 2002 |
Other
Other | 7th International Conference on Spoken Language Processing, ICSLP 2002 |
---|---|
Country/Territory | United States |
City | Denver |
Period | 9/16/02 → 9/20/02 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language