Emotion recognition from speech via boosted Gaussian mixture models

Hao Tang, Stephen M. Chu, Mark Hasegawa-Johnson, Thomas S. Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e. Bayesian optimal classifier) are popular and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features and their parameters are estimated by the expectation maximization (EM) algorithm based on a training data set. Then, classification is performed to minimize the classification error w.r.t. the estimated class-conditional distributions. We call this method the EM-GMM algorithm. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is named the Boosted-GMM algorithm. Our speech emotion recognition experiments show that the emotion recognition rates are effectively and significantly "boosted" by the Boosted-GMM algorithm as compared to the EM-GMM algorithm. This is due to the fact that the boosting algorithm can lead to more accurate estimates of the class-conditional GMMs, namely the class-conditional distributions of acoustic features.
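The baseline EM-GMM pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses one-dimensional synthetic features instead of real acoustic features, and the class labels, quantile-based initialization, variance floor, and function names (`fit_gmm_em`, `classify`) are all hypothetical. It does not implement the Boosted-GMM algorithm, whose details are not given in the abstract.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, n_components=2, n_iter=50, var_floor=1e-3):
    """Fit a 1-D GMM by plain EM; returns a list of (weight, mean, variance)."""
    srt = sorted(data)
    n = len(srt)
    # Assumed initialization: place the means at evenly spaced quantiles.
    means = [srt[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    overall_mean = sum(srt) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in srt) / n, var_floor)
    variances = [overall_var] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, var_floor)  # floor avoids collapsed components
    return list(zip(weights, means, variances))


def log_likelihood(x, gmm):
    """Log of the class-conditional density p(x | class) under a fitted GMM."""
    return math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm) + 1e-300)


def classify(x, class_gmms, priors):
    """Minimum-error-rate decision: maximize log p(x | c) + log P(c) over classes c."""
    return max(class_gmms,
               key=lambda c: log_likelihood(x, class_gmms[c]) + math.log(priors[c]))


# Synthetic training data; the two emotion labels are placeholders.
rng = random.Random(0)
train = {
    "neutral": [rng.gauss(0.0, 1.0) for _ in range(100)]
               + [rng.gauss(2.0, 0.5) for _ in range(100)],
    "angry": [rng.gauss(6.0, 1.0) for _ in range(100)]
             + [rng.gauss(8.0, 0.5) for _ in range(100)],
}
class_gmms = {label: fit_gmm_em(samples) for label, samples in train.items()}
n_total = sum(len(samples) for samples in train.values())
priors = {label: len(samples) / n_total for label, samples in train.items()}

print(classify(0.5, class_gmms, priors), classify(7.0, class_gmms, priors))
```

One GMM is fitted per class and the decision rule compares the prior-weighted class-conditional densities, so any improvement in how those densities are estimated changes the classifier directly. That is the part the abstract says the boosting step targets.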

Original language: English (US)
Title of host publication: Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
Pages: 294-297
Number of pages: 4
State: Published - 2009
Event: 2009 IEEE International Conference on Multimedia and Expo, ICME 2009 - New York, NY, United States
Duration: Jun 28 2009 to Jul 3 2009

Publication series

Name: Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009

Other

Other: 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
Country/Territory: United States
City: New York, NY
Period: 6/28/09 to 7/3/09

Keywords

  • Bayesian optimal classifier
  • Boosting
  • EM algorithm
  • Emotion recognition
  • Gaussian mixture model

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
