TY - GEN
T1 - A graph-based hybrid framework for modeling complex heterogeneity
AU - Yang, Pei
AU - He, Jingrui
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2016/1/5
Y1 - 2016/1/5
N2 - Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.
AB - Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task/view/instance heterogeneity) often co-exist in these applications, thus posing new challenges to existing techniques, most of which are tailored to one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while minimizing the empirical classification loss. Furthermore, we analyze its performance based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent and the bundle method for optimization. Experimental results on various data sets show the effectiveness of the proposed algorithm.
KW - Heterogeneous learning
KW - Multi-instance learning
KW - Multi-task learning
KW - Multi-view learning
UR - http://www.scopus.com/inward/record.url?scp=84963522428&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963522428&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2015.109
DO - 10.1109/ICDM.2015.109
M3 - Conference contribution
AN - SCOPUS:84963522428
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1081
EP - 1086
BT - Proceedings - 15th IEEE International Conference on Data Mining, ICDM 2015
A2 - Aggarwal, Charu
A2 - Zhou, Zhi-Hua
A2 - Tuzhilin, Alexander
A2 - Xiong, Hui
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Conference on Data Mining, ICDM 2015
Y2 - 14 November 2015 through 17 November 2015
ER -