TY - JOUR
T1 - VecQ
T2 - Minimal Loss DNN Model Compression with Vectorized Weight Quantization
AU - Gong, Cheng
AU - Chen, Yao
AU - Lu, Ye
AU - Li, Tao
AU - Hao, Cong
AU - Chen, Deming
N1 - Funding Information:
This work was supported in part by the National Natural Science Foundation (61872200), in part by the National Key Research and Development Program of China (2018YFB2100304, 2018YFB1003405), in part by the Natural Science Foundation of Tianjin (19JCZDJC31600, 19JCQNJC00600), and in part by the Open Project Fund of State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CARCH201905). It is also supported in part by the National Research Foundation, Prime Minister’s Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme, and in part by the IBM-Illinois Center for Cognitive Computing System Research (C3SR)—a research collaboration as part of IBM AI Horizons Network.
Publisher Copyright:
© 1968-2012 IEEE.
PY - 2021/5/1
Y1 - 2021/5/1
N2 - Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and the final accuracy is complex and non-convex, which makes it difficult to optimize directly. Minimizing the direct quantization loss (DQL) of the coefficient data is an effective local optimization method, but previous works often neglect accurate control of the DQL, resulting in a higher loss of the final DNN model accuracy. In this paper, we propose a novel metric called Vector Loss. Using this new metric, we decompose the minimization of the DQL into two independent optimization processes, which significantly outperform the traditional iterative L2 loss minimization process in terms of effectiveness, quantization loss, and final DNN accuracy. We also develop a new DNN quantization solution called VecQ, which provides minimal direct quantization loss and achieves higher model accuracy. To speed up the proposed quantization process during model training, we accelerate it with a parameterized probability estimation method and a template-based derivation calculation. We evaluate our proposed algorithm on the MNIST, CIFAR, ImageNet, IMDB movie review, and THUCNews text data sets with numerical DNN models. The results demonstrate that our proposed quantization solution is more accurate and effective than the state-of-the-art approaches, yet with more flexible bitwidth support. Moreover, the evaluation of our quantized models on Salient Object Detection (SOD) tasks maintains comparable feature extraction quality with up to 16× weight size reduction.
AB - Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between the quantization bitwidth and the final accuracy is complex and non-convex, which makes it difficult to optimize directly. Minimizing the direct quantization loss (DQL) of the coefficient data is an effective local optimization method, but previous works often neglect accurate control of the DQL, resulting in a higher loss of the final DNN model accuracy. In this paper, we propose a novel metric called Vector Loss. Using this new metric, we decompose the minimization of the DQL into two independent optimization processes, which significantly outperform the traditional iterative L2 loss minimization process in terms of effectiveness, quantization loss, and final DNN accuracy. We also develop a new DNN quantization solution called VecQ, which provides minimal direct quantization loss and achieves higher model accuracy. To speed up the proposed quantization process during model training, we accelerate it with a parameterized probability estimation method and a template-based derivation calculation. We evaluate our proposed algorithm on the MNIST, CIFAR, ImageNet, IMDB movie review, and THUCNews text data sets with numerical DNN models. The results demonstrate that our proposed quantization solution is more accurate and effective than the state-of-the-art approaches, yet with more flexible bitwidth support. Moreover, the evaluation of our quantized models on Salient Object Detection (SOD) tasks maintains comparable feature extraction quality with up to 16× weight size reduction.
KW - DNN compression
KW - DNN quantization
KW - low bitwidth
KW - vector loss
KW - vectorized weight quantization
UR - http://www.scopus.com/inward/record.url?scp=85104082228&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104082228&partnerID=8YFLogxK
U2 - 10.1109/TC.2020.2995593
DO - 10.1109/TC.2020.2995593
M3 - Article
AN - SCOPUS:85104082228
SN - 0018-9340
VL - 70
SP - 696
EP - 710
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 5
M1 - 9095420
ER -