A survey of affect recognition methods: Audio, visual, and spontaneous expressions

Zhihong Zeng, Maja Pantic, Glenn I. Roisman, Thomas S. Huang

Research output: Contribution to journal (Article)


Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multi-cue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches to solving the problem of machine understanding of human affective behavior, and discuss important issues such as the collection and availability of training and test data. Finally, we outline some of the scientific and engineering challenges to advancing human affect sensing technology.

Original language: English (US)
Pages (from-to): 39-58
Number of pages: 20
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Issue number: 1
State: Published - Jan 1 2009


Keywords

  • Evaluation/methodology
  • Human-centered computing
  • Introductory and Survey

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics
