TY - JOUR
T1 - Dreaming to distill: Data-free knowledge transfer via DeepInversion
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
AU - Yin, Hongxu
AU - Molchanov, Pavlo
AU - Alvarez, Jose M.
AU - Li, Zhizhong
AU - Mallya, Arun
AU - Hoiem, Derek
AU - Jha, Niraj K.
AU - Kautz, Jan
N1 - Funding Information:
Work supported in part by ONR MURI N00014-16-1-2007.
N1 - ∗Equal contribution. Work done during an internship at NVIDIA.
Publisher Copyright:
© 2020 IEEE.
PY - 2020
Y1 - 2020
N2 - We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network. We "invert" a trained network (teacher) to synthesize class-conditional input images starting from random noise, without using any additional information about the training dataset. Keeping the teacher fixed, our method optimizes the input while regularizing the distribution of intermediate feature maps using information stored in the batch normalization layers of the teacher. Further, we improve the diversity of synthesized images using Adaptive DeepInversion, which maximizes the Jensen-Shannon divergence between the teacher and student network logits. The resulting synthesized images from networks trained on the CIFAR-10 and ImageNet datasets demonstrate high fidelity and degree of realism, and help enable a new breed of data-free applications - ones that do not require any real images or labeled data. We demonstrate the applicability of our proposed method to three tasks of immense practical importance - (i) data-free network pruning, (ii) data-free knowledge transfer, and (iii) data-free continual learning.
AB - We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network. We "invert" a trained network (teacher) to synthesize class-conditional input images starting from random noise, without using any additional information about the training dataset. Keeping the teacher fixed, our method optimizes the input while regularizing the distribution of intermediate feature maps using information stored in the batch normalization layers of the teacher. Further, we improve the diversity of synthesized images using Adaptive DeepInversion, which maximizes the Jensen-Shannon divergence between the teacher and student network logits. The resulting synthesized images from networks trained on the CIFAR-10 and ImageNet datasets demonstrate high fidelity and degree of realism, and help enable a new breed of data-free applications - ones that do not require any real images or labeled data. We demonstrate the applicability of our proposed method to three tasks of immense practical importance - (i) data-free network pruning, (ii) data-free knowledge transfer, and (iii) data-free continual learning.
UR - http://www.scopus.com/inward/record.url?scp=85094570515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094570515&partnerID=8YFLogxK
U2 - 10.1109/CVPR42600.2020.00874
DO - 10.1109/CVPR42600.2020.00874
M3 - Conference article
AN - SCOPUS:85094570515
SN - 1063-6919
SP - 8712
EP - 8721
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 9156864
Y2 - 14 June 2020 through 19 June 2020
ER -