An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition

Mohamed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, Yigal Brandman

Research output: Contribution to conference > Paper

Abstract

This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined to represent the distinctions among the phonemes. Compared with FFT-based Mel-frequency cepstral coefficients (MFCC) at the same feature-vector length, acoustic features selected by a maximum mutual information criterion slightly increase overall phoneme recognition accuracy, both for clean speech and at 10 dB. Across 16 different feature sets, ranking the sets by mutual information predicts phoneme recognition accuracy with a correlation coefficient of 0.71, compared with 0.28 when the sets are ranked by a criterion based on the average pair-wise Kullback-Leibler divergence.
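
To make the selection criterion concrete, the following is a minimal sketch of ranking candidate features by their estimated mutual information with a phoneme label. It is not the authors' estimator: it assumes each feature has already been quantized to discrete values, scores features one at a time rather than jointly, and uses a simple plug-in (count-based) estimate. The names `mutual_information` and `rank_features` and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), expressed with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(features, labels):
    """Sort feature names by decreasing estimated MI with the labels."""
    scores = {name: mutual_information(vals, labels)
              for name, vals in features.items()}
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


# Toy data (assumed, not from the paper): two phoneme classes, three
# quantized candidate features of varying usefulness.
labels = ["a", "a", "b", "b", "a", "b", "a", "b"]
features = {
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],  # fully determines the label
    "noisy":       [0, 1, 1, 1, 0, 1, 0, 0],  # agrees with the label 6 times out of 8
    "useless":     [0, 1, 0, 1, 1, 0, 0, 1],  # independent of the label
}

order, scores = rank_features(features, labels)
for name in order:
    print(f"{name}: {scores[name]:.3f} bits")
```

With real acoustic features, which are continuous and combined into vectors, a density-based or joint estimate of mutual information would be needed; the per-feature ranking here only illustrates the idea of preferring features that share more information with the phoneme identity.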

Original language: English (US)
Pages: 2129-2132
Number of pages: 4
State: Published - Jan 1 2002
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: Sep 16 2002 to Sep 20 2002

Other

Other: 7th International Conference on Spoken Language Processing, ICSLP 2002
Country: United States
City: Denver
Period: 9/16/02 to 9/20/02

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Omar, M. K., Chen, K., Hasegawa-Johnson, M., & Brandman, Y. (2002). An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition. 2129-2132. Paper presented at 7th International Conference on Spoken Language Processing, ICSLP 2002, Denver, United States.