Audio-visual affect recognition through multi-stream fused HMM for HCI

Zhihong Zeng, Jilin Tu, Brian Pianfetti, Ming Liu, Tong Zhang, Zhenqiu Zhang, Thomas S Huang, Stephen Levinson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Advances in computer processing power and emerging algorithms are enabling new ways of envisioning Human Computer Interaction. This paper focuses on the development of a computing algorithm that uses audio and visual sensors to detect and track a user's affective state to aid computer decision making. Using our Multi-stream Fused Hidden Markov Model (MFHMM), we analyzed coupled audio and visual streams to detect 11 cognitive/emotive states. The MFHMM builds an optimal connection among multiple streams according to the maximum entropy principle and the maximum mutual information criterion. Person-independent experimental results from 20 subjects in 660 sequences show that the MFHMM approach achieves an accuracy of 80.61%, outperforming face-only HMM, pitch-only HMM, energy-only HMM, and independent HMM fusion.

Original language: English (US)
Title of host publication: Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005
Publisher: IEEE Computer Society
Pages: 967-972
Number of pages: 6
ISBN (Print): 0769523722, 9780769523729
State: Published - 2005
Externally published: Yes
Event: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005 - San Diego, CA, United States
Duration: Jun 20 2005 to Jun 25 2005

Publication series

Name: Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005
Volume: II

Other

Other: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005
Country/Territory: United States
City: San Diego, CA
Period: 6/20/05 to 6/25/05

ASJC Scopus subject areas

  • Engineering(all)
