Data heterogeneity is an intrinsic property of many high-impact applications, such as insider threat detection, traffic prediction, brain image analysis, and quality control in manufacturing processes. Furthermore, multiple types of heterogeneity (e.g., task, view, and instance heterogeneity) often co-exist in these applications, posing new challenges to existing techniques, most of which are tailored to only one or two types of heterogeneity. To address this problem, we propose a novel graph-based hybrid approach that simultaneously models multiple types of heterogeneity in a principled framework. The objective is to maximize the smoothness consistency of neighboring nodes, the bag-instance correlation, and the task relatedness on the hybrid graphs, while simultaneously minimizing the empirical classification loss. We also analyze the performance of the approach based on Rademacher complexity, which sheds light on the benefits of jointly modeling multiple types of heterogeneity. To solve the resulting non-convex, non-smooth problem, we propose an iterative algorithm named M3 Learning, which combines block coordinate descent with the bundle method for optimization. Experimental results on various data sets demonstrate the effectiveness of the proposed algorithm.