TY - GEN
T1 - Person identification based on multichannel and multimodality fusion
AU - Liu, Ming
AU - Tang, Hao
AU - Ning, Huazhong
AU - Huang, Thomas
PY - 2007
Y1 - 2007
N2 - Person ID is very useful information for high-level video analysis and retrieval. In some scenarios, the recording is not only multimodal but also multichannel (microphone array, camera array). In this paper, we describe a multimodal person ID system based on multichannel and multimodal fusion. The audio-only system combines the 7-channel microphone recordings at the decision output of the individual audio-only systems. The audio system is modeled with the Universal Background Model (UBM) and Maximum a Posteriori adaptation framework, which is very popular in the speaker recognition literature. The visual-only system works directly on the appearance space via Zi norm and a nearest neighbor classifier. Linear fusion then combines the two modalities to improve the ID performance. The experiments indicate the effectiveness of microphone array fusion and audio/visual fusion.
AB - Person ID is very useful information for high-level video analysis and retrieval. In some scenarios, the recording is not only multimodal but also multichannel (microphone array, camera array). In this paper, we describe a multimodal person ID system based on multichannel and multimodal fusion. The audio-only system combines the 7-channel microphone recordings at the decision output of the individual audio-only systems. The audio system is modeled with the Universal Background Model (UBM) and Maximum a Posteriori adaptation framework, which is very popular in the speaker recognition literature. The visual-only system works directly on the appearance space via Zi norm and a nearest neighbor classifier. Linear fusion then combines the two modalities to improve the ID performance. The experiments indicate the effectiveness of microphone array fusion and audio/visual fusion.
UR - http://www.scopus.com/inward/record.url?scp=38049168628&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38049168628&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-69568-4_21
DO - 10.1007/978-3-540-69568-4_21
M3 - Conference contribution
AN - SCOPUS:38049168628
SN - 9783540695677
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 241
EP - 248
BT - Multimodal Technologies for Perception of Humans - First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006 Revised Selected Papers
PB - Springer
T2 - 1st International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006
Y2 - 6 April 2006 through 7 April 2006
ER -