TY - GEN
T1 - Select-additive learning: Improving generalization in multimodal sentiment analysis
T2 - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
AU - Wang, Haohan
AU - Meghawat, Aaksha
AU - Morency, Louis-Philippe
AU - Xing, Eric P.
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/28
Y1 - 2017/8/28
AB - Multimodal sentiment analysis is drawing an increasing amount of attention because it enables mining of opinions from video reviews, which are now plentiful on online platforms. However, only a few high-quality annotated datasets are available for training machine learning algorithms. These limited resources restrict the generalizability of models; for example, the unique characteristics of a few speakers (e.g., wearing glasses) may become a confounding factor for the sentiment classification task. In this paper, we propose a Select-Additive Learning (SAL) procedure that improves the generalizability of trained neural networks for multimodal sentiment analysis. In our experiments, we show that SAL significantly improves prediction accuracy in all three modalities (verbal, acoustic, visual), as well as in their fusion. Our results show that SAL, even when trained on one dataset, achieves good generalization across two new test datasets.
KW - Cross-datasets
KW - Cross-individual
KW - Generalization
KW - Multimodal
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85030243094&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030243094&partnerID=8YFLogxK
U2 - 10.1109/ICME.2017.8019301
DO - 10.1109/ICME.2017.8019301
M3 - Conference contribution
AN - SCOPUS:85030243094
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 949
EP - 954
BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
PB - IEEE Computer Society
Y2 - 10 July 2017 through 14 July 2017
ER -