TY - GEN
T1 - FracBNN
T2 - 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2021
AU - Zhang, Yichi
AU - Pan, Junhao
AU - Liu, Xinheng
AU - Chen, Hongzheng
AU - Chen, Deming
AU - Zhang, Zhiru
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/2/17
Y1 - 2021/2/17
N2 - Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well suited for FPGAs, as their dominant computations are bitwise arithmetic and the memory requirement is also significantly reduced. However, compared to state-of-the-art compact convolutional neural network (CNN) models, BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet. In addition, the input layer of BNNs has gradually become a major compute bottleneck, because it is conventionally excluded from binarization to avoid a large accuracy loss. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs. Specifically, our approach employs a dual-precision activation scheme to compute features with up to two bits, using an additional sparse binary convolution. We further binarize the input layer using a novel thermometer encoding. Overall, FracBNN preserves the key benefits of conventional BNNs, where all convolutional layers are computed in pure binary MAC operations (BMACs). We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations. To evaluate the performance of FracBNN under a resource-constrained scenario, we implement the entire optimized network architecture on an embedded FPGA (Xilinx Ultra96 v2). Our experiments on ImageNet show that FracBNN achieves an accuracy comparable to MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of 28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also outperforms a recently introduced BNN model with an increase of 2.4% in top-1 accuracy while using the same model size. On the embedded FPGA device, FracBNN demonstrates the ability of real-time image classification.
AB - Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well suited for FPGAs, as their dominant computations are bitwise arithmetic and the memory requirement is also significantly reduced. However, compared to state-of-the-art compact convolutional neural network (CNN) models, BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet. In addition, the input layer of BNNs has gradually become a major compute bottleneck, because it is conventionally excluded from binarization to avoid a large accuracy loss. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs. Specifically, our approach employs a dual-precision activation scheme to compute features with up to two bits, using an additional sparse binary convolution. We further binarize the input layer using a novel thermometer encoding. Overall, FracBNN preserves the key benefits of conventional BNNs, where all convolutional layers are computed in pure binary MAC operations (BMACs). We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations. To evaluate the performance of FracBNN under a resource-constrained scenario, we implement the entire optimized network architecture on an embedded FPGA (Xilinx Ultra96 v2). Our experiments on ImageNet show that FracBNN achieves an accuracy comparable to MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of 28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also outperforms a recently introduced BNN model with an increase of 2.4% in top-1 accuracy while using the same model size. On the embedded FPGA device, FracBNN demonstrates the ability of real-time image classification.
KW - Binary neural networks
KW - Deep learning
KW - FPGA accelerators
KW - High-level synthesis
UR - http://www.scopus.com/inward/record.url?scp=85102042278&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102042278&partnerID=8YFLogxK
U2 - 10.1145/3431920.3439296
DO - 10.1145/3431920.3439296
M3 - Conference contribution
AN - SCOPUS:85102042278
T3 - FPGA 2021 - 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
SP - 171
EP - 182
BT - FPGA 2021 - 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
PB - Association for Computing Machinery
Y2 - 28 February 2021 through 2 March 2021
ER -