TY - GEN
T1 - μL2Q
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
AU - Gong, Cheng
AU - Li, Tao
AU - Lu, Ye
AU - Hao, Cong
AU - Zhang, Xiaofan
AU - Chen, Deming
AU - Chen, Yao
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Data quantization has proven to be an effective method for compressing deep neural networks (DNNs) by using fewer bits to represent the parameters and intermediate data. The bit width of the data directly affects the memory footprint, computing capability, and energy consumption during the computation of DNN models. Although there are numerous existing studies on data quantization, there is still no quantitative analysis of existing quantization methods, which results in empirical quantization with unpredictable DNN accuracy loss. To address this problem, we propose an effective method, called ultra-low loss quantization (μL2Q), to provide DNN quantization schemes based on comprehensive quantitative data analysis. μL2Q transforms the original data into a data space with a standard normal distribution and then finds the optimal parameters to minimize the quantization loss for a targeted bit width. In addition, we integrate the proposed μL2Q into the popular machine learning framework Caffe for convenient end-to-end DNN design and training. Compared to state-of-the-art DNN compression designs, μL2Q shows the greatest ability to maintain DNN accuracy after quantization. In our experiments, the proposed method delivers 4.42%, 16.70%, 1.95%, 8.26%, and 5.63% accuracy improvements on Lenet-5, Cifarnet, VGG7-64, and Resnet-18 (Top-1/Top-5), respectively, compared to state-of-the-art solutions with the same compression ratio.
AB - Data quantization has proven to be an effective method for compressing deep neural networks (DNNs) by using fewer bits to represent the parameters and intermediate data. The bit width of the data directly affects the memory footprint, computing capability, and energy consumption during the computation of DNN models. Although there are numerous existing studies on data quantization, there is still no quantitative analysis of existing quantization methods, which results in empirical quantization with unpredictable DNN accuracy loss. To address this problem, we propose an effective method, called ultra-low loss quantization (μL2Q), to provide DNN quantization schemes based on comprehensive quantitative data analysis. μL2Q transforms the original data into a data space with a standard normal distribution and then finds the optimal parameters to minimize the quantization loss for a targeted bit width. In addition, we integrate the proposed μL2Q into the popular machine learning framework Caffe for convenient end-to-end DNN design and training. Compared to state-of-the-art DNN compression designs, μL2Q shows the greatest ability to maintain DNN accuracy after quantization. In our experiments, the proposed method delivers 4.42%, 16.70%, 1.95%, 8.26%, and 5.63% accuracy improvements on Lenet-5, Cifarnet, VGG7-64, and Resnet-18 (Top-1/Top-5), respectively, compared to state-of-the-art solutions with the same compression ratio.
UR - http://www.scopus.com/inward/record.url?scp=85073232958&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073232958&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2019.8851699
DO - 10.1109/IJCNN.2019.8851699
M3 - Conference contribution
AN - SCOPUS:85073232958
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 July 2019 through 19 July 2019
ER -