TY - GEN
T1 - FPGA/DNN co-design
T2 - 56th Annual Design Automation Conference, DAC 2019
AU - Hao, Cong
AU - Zhang, Xiaofan
AU - Li, Yuhong
AU - Huang, Sitao
AU - Xiong, Jinjun
AU - Rupnow, Kyle
AU - Hwu, Wen Mei
AU - Chen, Deming
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/6/2
Y1 - 2019/6/2
N2 - While embedded FPGAs are attractive platforms for DNN acceleration on edge-devices due to their low latency and high energy efficiency, the scarcity of resources of edge-scale FPGA devices also makes it challenging for DNN deployment. In this paper, we propose a simultaneous FPGA/DNN co-design methodology with both bottom-up and top-down approaches: a bottom-up hardware-oriented DNN model search for high accuracy, and a top-down FPGA accelerator design considering DNN-specific characteristics. We also build an automatic co-design flow, including an Auto-DNN engine to perform hardware-oriented DNN model search, as well as an Auto-HLS engine to generate synthesizable C code of the FPGA accelerator for explored DNNs. We demonstrate our co-design approach on an object detection task using PYNQ-Z1 FPGA. Results show that our proposed DNN model and accelerator outperform the state-of-the-art FPGA designs in all aspects including Intersection-over-Union (IoU) (6.2% higher), frames per second (FPS) (2.48× higher), power consumption (40% lower), and energy efficiency (2.5× higher). Compared to GPU-based solutions, our designs deliver similar accuracy but consume far less energy.
AB - While embedded FPGAs are attractive platforms for DNN acceleration on edge-devices due to their low latency and high energy efficiency, the scarcity of resources of edge-scale FPGA devices also makes it challenging for DNN deployment. In this paper, we propose a simultaneous FPGA/DNN co-design methodology with both bottom-up and top-down approaches: a bottom-up hardware-oriented DNN model search for high accuracy, and a top-down FPGA accelerator design considering DNN-specific characteristics. We also build an automatic co-design flow, including an Auto-DNN engine to perform hardware-oriented DNN model search, as well as an Auto-HLS engine to generate synthesizable C code of the FPGA accelerator for explored DNNs. We demonstrate our co-design approach on an object detection task using PYNQ-Z1 FPGA. Results show that our proposed DNN model and accelerator outperform the state-of-the-art FPGA designs in all aspects including Intersection-over-Union (IoU) (6.2% higher), frames per second (FPS) (2.48× higher), power consumption (40% lower), and energy efficiency (2.5× higher). Compared to GPU-based solutions, our designs deliver similar accuracy but consume far less energy.
UR - http://www.scopus.com/inward/record.url?scp=85067818581&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067818581&partnerID=8YFLogxK
U2 - 10.1145/3316781.3317829
DO - 10.1145/3316781.3317829
M3 - Conference contribution
AN - SCOPUS:85067818581
T3 - Proceedings - Design Automation Conference
BT - Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 June 2019 through 6 June 2019
ER -