TY - GEN
T1 - Multimodal human emotion/expression recognition
AU - Chen, Lawrence S.
AU - Huang, Thomas S.
AU - Miyasato, Tsutomu
AU - Nakatsu, Ryohei
PY - 1998
Y1 - 1998
N2 - Recognizing human facial expressions and emotion by computer is an interesting and challenging problem. Many have investigated the emotional content of speech alone, or the recognition of human facial expressions solely from images. However, relatively little has been done on combining these two modalities for recognizing human emotions. L.C. De Silva et al. (1997) studied human subjects' ability to recognize emotions from viewing video clips of facial expressions and listening to the corresponding emotional speech stimuli. They found that humans recognize some emotions better from audio information, and other emotions better from video. They also proposed an algorithm to integrate both kinds of input to mimic the human recognition process. While attempting to implement the algorithm, we encountered difficulties that led us to a different approach. We found these two modalities to be complementary. By using both, we show it is possible to achieve higher recognition rates than with either modality alone.
AB - Recognizing human facial expressions and emotion by computer is an interesting and challenging problem. Many have investigated the emotional content of speech alone, or the recognition of human facial expressions solely from images. However, relatively little has been done on combining these two modalities for recognizing human emotions. L.C. De Silva et al. (1997) studied human subjects' ability to recognize emotions from viewing video clips of facial expressions and listening to the corresponding emotional speech stimuli. They found that humans recognize some emotions better from audio information, and other emotions better from video. They also proposed an algorithm to integrate both kinds of input to mimic the human recognition process. While attempting to implement the algorithm, we encountered difficulties that led us to a different approach. We found these two modalities to be complementary. By using both, we show it is possible to achieve higher recognition rates than with either modality alone.
UR - http://www.scopus.com/inward/record.url?scp=84905387860&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905387860&partnerID=8YFLogxK
U2 - 10.1109/AFGR.1998.670976
DO - 10.1109/AFGR.1998.670976
M3 - Conference contribution
AN - SCOPUS:84905387860
SN - 0818683449
SN - 9780818683442
T3 - Proceedings - 3rd IEEE International Conference on Automatic Face and Gesture Recognition, FG 1998
SP - 366
EP - 371
BT - Proceedings - 3rd IEEE International Conference on Automatic Face and Gesture Recognition, FG 1998
PB - IEEE Computer Society
T2 - 3rd IEEE International Conference on Automatic Face and Gesture Recognition, FG 1998
Y2 - 14 April 1998 through 16 April 1998
ER -