Multi-stream confidence analysis for audio-visual affect recognition

Zhihong Zeng, Juin Tu, Ming Liu, Thomas S Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Changes in a speaker's emotion are a fundamental component in human communication. Some emotions motivate human actions while others add deeper meaning and richness to human interactions. In this paper, we explore the development of a computing algorithm that uses audio and visual sensors to recognize a speaker's affective state. Within the framework of Multi-stream Hidden Markov Model (MHMM), we analyze audio and visual observations to detect 11 cognitive/emotive states. We investigate the use of individual modality confidence measures as a means of estimating weights when combining likelihoods in the audio-visual decision fusion. Person-independent experimental results from 20 subjects in 660 sequences suggest that the use of stream exponents estimated on training data results in classification accuracy improvement of audio-visual affect recognition.
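
The decision fusion described in the abstract can be illustrated with a minimal sketch. It assumes the common log-linear form of stream weighting, in which each modality's per-class log-likelihood is scaled by a stream exponent and the two exponents sum to one. The abstract does not say how the exponents are estimated beyond "on training data", so the grid search below is an assumption, and all function names and toy values are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def fuse_log_likelihoods(audio_ll: Sequence[float],
                         visual_ll: Sequence[float],
                         audio_weight: float) -> List[float]:
    """Weight per-class log-likelihoods by stream exponents that sum to 1."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if len(audio_ll) != len(visual_ll):
        raise ValueError("both streams must score the same classes")
    visual_weight = 1.0 - audio_weight
    return [audio_weight * a + visual_weight * v
            for a, v in zip(audio_ll, visual_ll)]


def classify(audio_ll: Sequence[float],
             visual_ll: Sequence[float],
             audio_weight: float) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)


def estimate_audio_weight(train_audio: Sequence[Sequence[float]],
                          train_visual: Sequence[Sequence[float]],
                          labels: Sequence[int],
                          steps: int = 11) -> float:
    """Grid-search the audio exponent that maximizes training accuracy.

    Assumption: a plain grid search stands in for whatever estimation
    procedure the paper uses. Ties keep the smallest weight found.
    """
    best_weight, best_correct = 0.0, -1
    for i in range(steps):
        weight = i / (steps - 1)
        correct = sum(
            classify(a, v, weight) == y
            for a, v, y in zip(train_audio, train_visual, labels)
        )
        if correct > best_correct:
            best_weight, best_correct = weight, correct
    return best_weight


# Toy data (hypothetical): two classes, audio stream is the reliable one.
train_audio = [[-1.0, -3.0], [-3.0, -1.0]]
train_visual = [[-4.0, -2.0], [-2.0, -4.0]]
labels = [0, 1]
learned_weight = estimate_audio_weight(train_audio, train_visual, labels)
```

On this toy data the visual stream always favors the wrong class, so the search settles on an audio exponent above 0.5. A real system would take the per-class log-likelihoods from trained audio and visual models and would estimate the exponents on held-out subjects to stay person-independent.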

Original language: English (US)
Title of host publication: Affective Computing and Intelligent Interaction - First International Conference, ACII 2005, Proceedings
Pages: 964-971
Number of pages: 8
State: Published - Dec 1 2005
Event: 1st International Conference on Affective Computing and Intelligent Interaction, ACII 2005 - Beijing, China
Duration: Oct 22 2005 to Oct 24 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3784 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 1st International Conference on Affective Computing and Intelligent Interaction, ACII 2005
Country/Territory: China
City: Beijing
Period: 10/22/05 to 10/24/05

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
