TY - GEN
T1 - Emotion recognition from speech via boosted Gaussian mixture models
AU - Tang, Hao
AU - Chu, Stephen M.
AU - Hasegawa-Johnson, Mark
AU - Huang, Thomas S.
PY - 2009
Y1 - 2009
N2 - Gaussian mixture models (GMMs) and the minimum error rate classifier (i.e., the Bayes optimal classifier) are popular and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation-maximization (EM) algorithm on a training data set. Classification is then performed to minimize the classification error with respect to the estimated class-conditional distributions. We call this method the EM-GMM algorithm. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is named the Boosted-GMM algorithm. Our speech emotion recognition experiments show that the emotion recognition rates are effectively and significantly "boosted" by the Boosted-GMM algorithm as compared to the EM-GMM algorithm. This is because the boosting algorithm can lead to more accurate estimates of the class-conditional GMMs, namely the class-conditional distributions of acoustic features.
KW - Bayesian optimal classifier
KW - Boosting
KW - EM algorithm
KW - Emotion recognition
KW - Gaussian mixture model
UR - http://www.scopus.com/inward/record.url?scp=70449563206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449563206&partnerID=8YFLogxK
U2 - 10.1109/ICME.2009.5202493
DO - 10.1109/ICME.2009.5202493
M3 - Conference contribution
AN - SCOPUS:70449563206
SN - 9781424442911
T3 - Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
SP - 294
EP - 297
BT - Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
T2 - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
Y2 - 28 June 2009 through 3 July 2009
ER -