AccDNN: An IP-Based DNN Generator for FPGAs

Xiaofan Zhang, Junsong Wang, Chao Zhu, Yonghua Lin, Jinjun Xiong, Wen-Mei W Hwu, Deming Chen

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Using FPGAs to accelerate Deep Neural Networks (DNNs) requires RTL programming, hardware verification, and precise resource allocation, all of which are time-consuming and challenging. To address this issue, we present AccDNN, an end-to-end automation tool that automatically generates high-performance DNN designs on FPGAs. Highlights of this tool include high-quality RTL network layer IPs, a fine-grained layer-based pipeline architecture, and a column-based cache scheme for high throughput, low latency, and reduced on-chip memory utilization. AccDNN also includes an automatic design space exploration tool, called A-REALM, which generates optimized parallelism schemes by considering external memory access bandwidth, data reuse behaviors, resource availability, and network complexity. We demonstrate AccDNN on four DNNs (AlexNet, ZF, VGG16, and YOLO) on two Xilinx FPGAs (ZC706 and KU115) for edge and cloud computing, respectively. AccDNN generates designs that deliver 263 GOPS and 36.4 GOPS/W on ZC706 without any batching and 2109 GOPS and 94.5 GOPS/W on KU115.
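The throughput and efficiency figures quoted in the abstract imply an approximate power draw for each board, since power (W) equals throughput (GOPS) divided by efficiency (GOPS/W). The sketch below works through that arithmetic. It assumes each GOPS and GOPS/W pair was reported for the same design and measurement, which the abstract does not state explicitly; the helper name `implied_power_watts` is illustrative and is not part of AccDNN.

```python
# Implied power from the abstract's throughput and efficiency figures.
# Assumption: each GOPS and GOPS/W pair describes the same design and run.
reported = {
    "ZC706": {"gops": 263.0, "gops_per_watt": 36.4},
    "KU115": {"gops": 2109.0, "gops_per_watt": 94.5},
}


def implied_power_watts(gops, gops_per_watt):
    """Power (W) = throughput (GOPS) / efficiency (GOPS/W)."""
    return gops / gops_per_watt


# Hypothetical helper output: board name -> implied power in watts.
implied = {board: implied_power_watts(**figs) for board, figs in reported.items()}

for board, watts in implied.items():
    print(f"{board}: about {watts:.1f} W")
```

Under that assumption, the ZC706 design works out to roughly 7.2 W and the KU115 design to roughly 22.3 W. These are derived estimates, not numbers reported in the abstract.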

Original language: English (US)
Title of host publication: Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 1
ISBN (Electronic): 9781538655221
DOIs: https://doi.org/10.1109/FCCM.2018.00044
State: Published - Sep 7 2018
Event: 26th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018 - Boulder, United States
Duration: Apr 29 2018 to May 1 2018

Publication series

Name: Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018

Other

Other: 26th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018
Country: United States
City: Boulder
Period: 4/29/18 to 5/1/18

Fingerprint

Field programmable gate arrays (FPGA)
Data storage equipment
Network layers
Cloud computing
Resource allocation
Automation
Pipelines
Throughput
Availability
Hardware
Bandwidth
Deep neural networks

Keywords

  • Acceleration
  • Automation tool
  • Deep Neural Network
  • FPGA

ASJC Scopus subject areas

  • Artificial Intelligence
  • Hardware and Architecture
  • Software

Cite this

Zhang, X., Wang, J., Zhu, C., Lin, Y., Xiong, J., Hwu, W-M. W., & Chen, D. (2018). AccDNN: An IP-Based DNN Generator for FPGAs. In Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018 [8457659] (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/FCCM.2018.00044

AccDNN: An IP-Based DNN Generator for FPGAs. / Zhang, Xiaofan; Wang, Junsong; Zhu, Chao; Lin, Yonghua; Xiong, Jinjun; Hwu, Wen-Mei W; Chen, Deming.

Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018. Institute of Electrical and Electronics Engineers Inc., 2018. 8457659 (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018).


Zhang, X, Wang, J, Zhu, C, Lin, Y, Xiong, J, Hwu, W-MW & Chen, D 2018, AccDNN: An IP-Based DNN Generator for FPGAs. in Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018, 8457659, Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018, Institute of Electrical and Electronics Engineers Inc., 26th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018, Boulder, United States, 4/29/18. https://doi.org/10.1109/FCCM.2018.00044
Zhang X, Wang J, Zhu C, Lin Y, Xiong J, Hwu W-MW et al. AccDNN: An IP-Based DNN Generator for FPGAs. In Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018. Institute of Electrical and Electronics Engineers Inc. 2018. 8457659. (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018). https://doi.org/10.1109/FCCM.2018.00044
Zhang, Xiaofan; Wang, Junsong; Zhu, Chao; Lin, Yonghua; Xiong, Jinjun; Hwu, Wen-Mei W; Chen, Deming. / AccDNN: An IP-Based DNN Generator for FPGAs. Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018. Institute of Electrical and Electronics Engineers Inc., 2018. (Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018).
@inproceedings{972e88ec6cad44479fdde5e37e2a2610,
title = "AccDNN: An IP-Based DNN Generator for FPGAs",
keywords = "Acceleration, Automation tool, Deep Neural Network, FPGA",
author = "Xiaofan Zhang and Junsong Wang and Chao Zhu and Yonghua Lin and Jinjun Xiong and Hwu, {Wen-Mei W} and Deming Chen",
year = "2018",
month = "9",
day = "7",
doi = "10.1109/FCCM.2018.00044",
language = "English (US)",
series = "Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018",
address = "United States",

}

TY - GEN

T1 - AccDNN

T2 - An IP-Based DNN Generator for FPGAs

AU - Zhang, Xiaofan

AU - Wang, Junsong

AU - Zhu, Chao

AU - Lin, Yonghua

AU - Xiong, Jinjun

AU - Hwu, Wen-Mei W

AU - Chen, Deming

PY - 2018/9/7

Y1 - 2018/9/7

AB - Using FPGA to accelerate Deep Neural Networks (DNNs) requires RTL programming, hardware verification, and precise resource allocation, which is both time-consuming and challenging. To address this issue, we present AccDNN, an end-to-end automation tool that can generate high-performance DNN designs on FPGAs automatically. Highlights of this tool include high-quality RTL network layer IPs, a fine-grained layer-based pipeline architecture, and a column-based cache scheme for high throughput, low latency, and reduced on-chip memory utilization. AccDNN also includes an automatic design space exploration tool, called A-REALM, used to generate optimized parallelism schemes by considering external memory access bandwidth, data reuse behaviors, resource availability, and network complexity. We demonstrate AccDNN on four DNNs (Alexnet, ZF, VGG16, and YOLO) on two Xilinx FPGAs (ZC706 and KU115) for edge- and cloud-computing, respectively. AccDNN generates designs that deliver 263 GOPS and 36.4 GOPS/W on ZC706 without any batching and 2109 GOPS and 94.5 GOPS/W on KU115.

KW - Acceleration

KW - Automation tool

KW - Deep Neural Network

KW - FPGA

UR - http://www.scopus.com/inward/record.url?scp=85057752883&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057752883&partnerID=8YFLogxK

U2 - 10.1109/FCCM.2018.00044

DO - 10.1109/FCCM.2018.00044

M3 - Conference contribution

AN - SCOPUS:85057752883

T3 - Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018

BT - Proceedings - 26th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -