TY - GEN
T1 - Contrastive Learning Relies More on Spatial Inductive Bias Than Supervised Learning
T2 - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
AU - Zhong, Yuanyi
AU - Tang, Haoran
AU - Chen, Jun Kun
AU - Wang, Yu Xiong
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Although self-supervised contrastive learning (CL) has shown the potential to achieve state-of-the-art accuracy without any supervision, its behavior remains under-investigated. Unlike most previous work, which seeks to understand CL through its learning objectives, we focus on an unexplored yet natural aspect: the spatial inductive bias that seems to be implicitly exploited via data augmentations in CL. We design an experiment to study the reliance of CL on this spatial inductive bias: we destroy the global or local spatial structure of an image with global or local patch shuffling, and use the performance drop between the original and corrupted datasets to quantify the reliance on a given inductive bias. We also use the uniformity of the feature space to further examine how CL-pre-trained models behave on the corrupted datasets. Our results and analysis show that CL relies much more heavily on spatial inductive bias than supervised learning (SL), regardless of the specific CL algorithm or backbone, opening a new direction for studying the behavior of CL.
UR - http://www.scopus.com/inward/record.url?scp=85188257218&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188257218&partnerID=8YFLogxK
U2 - 10.1109/ICCV51070.2023.01496
DO - 10.1109/ICCV51070.2023.01496
M3 - Conference contribution
AN - SCOPUS:85188257218
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 16281
EP - 16290
BT - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 October 2023 through 6 October 2023
ER -