TY - GEN
T1 - Towards optimal quantization of neural networks
AU - Chatterjee, Avhishek
AU - Varshney, Lav R.
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/9
Y1 - 2017/8/9
N2 - Due to the unprecedented success of deep neural networks in inference tasks like speech and image recognition, there has been increasing interest in using them in mobile and in-sensor applications. As most current deep neural networks are very large in size, a major challenge lies in storing the network in devices with limited memory. Consequently there is growing interest in compressing deep networks by quantizing synaptic weights, but most prior work is heuristic and lacking theoretical foundations. Here we develop an approach to quantizing deep networks using functional high-rate quantization theory. Under certain technical conditions, this approach leads to an optimal quantizer that is computed using the celebrated backpropagation algorithm. In all other cases, a heuristic quantizer with certain regularization guarantees can be computed.
AB - Due to the unprecedented success of deep neural networks in inference tasks like speech and image recognition, there has been increasing interest in using them in mobile and in-sensor applications. As most current deep neural networks are very large in size, a major challenge lies in storing the network in devices with limited memory. Consequently there is growing interest in compressing deep networks by quantizing synaptic weights, but most prior work is heuristic and lacking theoretical foundations. Here we develop an approach to quantizing deep networks using functional high-rate quantization theory. Under certain technical conditions, this approach leads to an optimal quantizer that is computed using the celebrated backpropagation algorithm. In all other cases, a heuristic quantizer with certain regularization guarantees can be computed.
KW - Deep neural network
KW - Quantization theory
UR - http://www.scopus.com/inward/record.url?scp=85034099972&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85034099972&partnerID=8YFLogxK
U2 - 10.1109/ISIT.2017.8006711
DO - 10.1109/ISIT.2017.8006711
M3 - Conference contribution
AN - SCOPUS:85034099972
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 1162
EP - 1166
BT - 2017 IEEE International Symposium on Information Theory, ISIT 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Symposium on Information Theory, ISIT 2017
Y2 - 25 June 2017 through 30 June 2017
ER -