T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA

Yao Chen, Kai Zhang, Cheng Gong, Cong Hao, Xiaofan Zhang, Tao Li, Deming Chen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Deep Neural Networks (DNNs) have become promising solutions for data analysis especially for raw data processing from sensors. However, using DNN-based approaches can easily introduce huge demands of computation and memory consumption, which may not be feasible for direct deployment onto the Internet of Thing (IoT) devices, since they have strict constraints on hardware resources, power budgets, response latency, and manufacturing cost. To bring DNNs into IoT devices, embedded FPGA can be one of the most suitable candidates by providing better energy efficiency than GPU and CPU based solutions, and higher flexibility than ASICs. In this paper, we propose a systematic solution to deploy DNNs on embedded FPGAs, which includes a ternarized hardware Deep Learning Accelerator (T-DLA), and a framework for ternary neural network (TNN) training. T-DLA is a highly optimized hardware unit in FPGA specializing in accelerating the TNNs, while the proposed framework can significantly compress the DNN parameters down to two bits with little accuracy drop. Results show that our training framework can compress the DNN up to 14.14x while maintaining nearly the same accuracy compared to the floating point version. By illustrating our proposed design techniques, the T-DLA can deliver up to 0.4TOPS with 2.576W power consumption, showing 873.6x and 5.1x higher energy efficiency (fps/W) on ImageNet with Resnet-18 model comparing to Xeon E5-2630 CPU and Nvidia 1080 Ti GPU. To the best of our knowledge, this is the first instruction-based highly efficient ternary DLA design reported from the literature.

Original languageEnglish (US)
Title of host publicationProceedings - 2019 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019
PublisherIEEE Computer Society
Pages13-18
Number of pages6
ISBN (Electronic)9781538670996
DOIs
StatePublished - Jul 2019
Event18th IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019 - Miami, United States
Duration: Jul 15 2019Jul 17 2019

Publication series

NameProceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI
Volume2019-July
ISSN (Print)2159-3469
ISSN (Electronic)2159-3477

Conference

Conference18th IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019
CountryUnited States
CityMiami
Period7/15/197/17/19

Keywords

  • Deep learning accelerator
  • Deep neural networks
  • Embedded FPGA
  • Multi clock domain
  • Ternary neural network

ASJC Scopus subject areas

  • Hardware and Architecture
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Fingerprint Dive into the research topics of 'T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA'. Together they form a unique fingerprint.

Cite this