TY - GEN
T1 - An audio-visual fusion framework with joint dimensionality reduction
AU - Liu, Ming
AU - Fu, Yun
AU - Huang, Thomas S.
PY - 2008
Y1 - 2008
N2 - By combining audio and visual modalities, speech recognition systems achieve higher performance and robustness. Fusion strategies to date fall mainly into three types: feature-level fusion, model-level fusion, and decision-level fusion. In this paper, we present a novel audio-visual fusion framework in which a joint dimensionality reduction approach projects the audio and visual features into more compact subspaces. With correlation-preserving criteria, the projected audio and visual features preserve the correlation conveyed in the original audio and visual feature spaces. At the same time, better model efficiency is achieved in the more compact feature spaces. Experiments on audio-visual person verification demonstrate the efficiency and effectiveness of the proposed fusion framework.
AB - By combining audio and visual modalities, speech recognition systems achieve higher performance and robustness. Fusion strategies to date fall mainly into three types: feature-level fusion, model-level fusion, and decision-level fusion. In this paper, we present a novel audio-visual fusion framework in which a joint dimensionality reduction approach projects the audio and visual features into more compact subspaces. With correlation-preserving criteria, the projected audio and visual features preserve the correlation conveyed in the original audio and visual feature spaces. At the same time, better model efficiency is achieved in the more compact feature spaces. Experiments on audio-visual person verification demonstrate the efficiency and effectiveness of the proposed fusion framework.
KW - Audio-visual fusion
KW - Audio-visual person verification
KW - Canonical correlation analysis
KW - Dimensionality reduction
UR - http://www.scopus.com/inward/record.url?scp=51449116849&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51449116849&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2008.4518640
DO - 10.1109/ICASSP.2008.4518640
M3 - Conference contribution
AN - SCOPUS:51449116849
SN - 1424414849
SN - 9781424414840
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4437
EP - 4440
BT - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
T2 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Y2 - 31 March 2008 through 4 April 2008
ER -