TY - GEN
T1 - Generative Modeling for Multi-task Visual Learning
AU - Bao, Zhipeng
AU - Hebert, Martial
AU - Wang, Yu-Xiong
N1 - Acknowledgement: This work was supported in part by ONR MURI N000014-16-1-2007 and AFRL Grant FA23861714660. YXW was supported in part by NSF Grant 2106825, the Jump ARCHES endowment through the Health Care Engineering Systems Center, and the New Frontiers Initiative.
PY - 2022
Y1 - 2022
AB - Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM framework improves the performance of all the tasks by large margins, consistently outperforming state-of-the-art multi-task approaches in different sample-size regimes.
UR - https://www.scopus.com/pages/publications/85163095168
M3 - Conference contribution
T3 - Proceedings of Machine Learning Research
SP - 1537
EP - 1554
BT - Proceedings of the 39th International Conference on Machine Learning
A2 - Chaudhuri, Kamalika
A2 - Jegelka, Stefanie
A2 - Song, Le
A2 - Szepesvari, Csaba
A2 - Niu, Gang
A2 - Sabato, Sivan
PB - PMLR
T2 - 39th International Conference on Machine Learning, ICML 2022
Y2 - 17 July 2022 through 23 July 2022
ER -