TY - CONF
T1 - Accumulation bit-width scaling for ultra-low precision training of deep networks
AU - Sakr, Charbel
AU - Wang, Naigang
AU  - Chen, Chia-Yu
AU - Choi, Jungwook
AU - Agrawal, Ankur
AU - Shanbhag, Naresh
AU - Gopalakrishnan, Kailash
N1 - Funding Information:
This work is supported in part by IBM Research; IBM SoftLayer; IBM Cognitive Computing Cluster (CCC); IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR) - a research collaboration as part of the IBM AI Horizons Network; and C-BRIC, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA. The authors would like to thank I-Hsin Chung, Ming-Hung Chen and Silvia Melitta Mueller for helpful discussions and support.
Publisher Copyright:
© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.
PY - 2019
Y1 - 2019
N2  - Efforts to reduce the numerical precision of computations in deep learning training have yielded systems that aggressively quantize weights and activations, yet employ wide high-precision accumulators for partial sums in inner-product operations to preserve the quality of convergence. The absence of any framework to analyze the precision requirements of partial-sum accumulations results in conservative design choices. This imposes an upper bound on the reduction of complexity of multiply-accumulate units. We present a statistical approach to analyze the impact of reduced accumulation precision on deep learning training. Observing that a bad choice of accumulation precision results in a loss of information that manifests itself as a reduction in variance in an ensemble of partial sums, we derive a set of equations that relate this variance to the length of accumulation and the minimum number of bits needed for accumulation. We apply our analysis to three benchmark networks: CIFAR-10 ResNet 32, ImageNet ResNet 18, and ImageNet AlexNet. In each case, with accumulation precision set in accordance with our proposed equations, the networks successfully converge to the single-precision floating-point baseline. We also show that reducing accumulation precision further degrades the quality of the trained network, proving that our equations produce tight bounds. Overall, this analysis enables precise tailoring of computation hardware to the application, yielding area- and power-optimal systems.
AB  - Efforts to reduce the numerical precision of computations in deep learning training have yielded systems that aggressively quantize weights and activations, yet employ wide high-precision accumulators for partial sums in inner-product operations to preserve the quality of convergence. The absence of any framework to analyze the precision requirements of partial-sum accumulations results in conservative design choices. This imposes an upper bound on the reduction of complexity of multiply-accumulate units. We present a statistical approach to analyze the impact of reduced accumulation precision on deep learning training. Observing that a bad choice of accumulation precision results in a loss of information that manifests itself as a reduction in variance in an ensemble of partial sums, we derive a set of equations that relate this variance to the length of accumulation and the minimum number of bits needed for accumulation. We apply our analysis to three benchmark networks: CIFAR-10 ResNet 32, ImageNet ResNet 18, and ImageNet AlexNet. In each case, with accumulation precision set in accordance with our proposed equations, the networks successfully converge to the single-precision floating-point baseline. We also show that reducing accumulation precision further degrades the quality of the trained network, proving that our equations produce tight bounds. Overall, this analysis enables precise tailoring of computation hardware to the application, yielding area- and power-optimal systems.
UR - http://www.scopus.com/inward/record.url?scp=85083953936&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083953936&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85083953936
T2 - 7th International Conference on Learning Representations, ICLR 2019
Y2 - 6 May 2019 through 9 May 2019
ER -
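
As a minimal sketch of the effect the abstract describes, the Python snippet below accumulates zero-mean, unit-variance terms at reduced mantissa precision and reports how much of the expected variance of the sum survives. It is not the authors' code; the round-after-every-add accumulator model and the choices of accumulation length, ensemble size, and mantissa widths are illustrative assumptions only.

import numpy as np

def round_to_mantissa(x, m_bits):
    """Round each element of x to m_bits of floating-point mantissa precision."""
    mant, exp = np.frexp(x)                    # x = mant * 2**exp, |mant| in [0.5, 1)
    scale = 2.0 ** m_bits
    return np.round(mant * scale) / scale * np.exp2(exp)

def accumulate(terms, m_bits):
    """Sequentially sum terms (shape: n x trials), rounding the running sum
    to m_bits of mantissa after every addition (assumed accumulator model)."""
    acc = np.zeros(terms.shape[1])
    for row in terms:
        acc = round_to_mantissa(acc + row, m_bits)
    return acc

rng = np.random.default_rng(0)
n, trials = 1 << 14, 200                       # accumulation length, ensemble size
terms = rng.standard_normal((n, trials))       # zero-mean, unit-variance addends
for m_bits in (5, 8, 12, 23):
    sums = accumulate(terms, m_bits)
    # With exact accumulation, Var(sum of n unit-variance terms) is about n,
    # so a ratio well below 1 indicates variance lost to swamping.
    print(f"mantissa bits = {m_bits:2d}: Var(sum)/n = {sums.var() / n:.3f}")

With these assumed settings, the narrowest mantissa width should show a markedly reduced variance ratio, while the wider settings track the exact value; this is the variance-retention phenomenon that the paper's equations relate to accumulation length and minimum accumulation bit-width.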