Mixed-signal charge-domain acceleration of deep neural networks through interleaved bit-partitioned arithmetic

Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Jongse Park, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Albeit low-power, mixed-signal circuitry suffers from significant overhead of Analog to Digital (A/D) conversion, limited range for information encoding, and susceptibility to noise. This paper aims to address these challenges by offering and leveraging the following mathematical insight regarding vector dot-product, the basic operator in Deep Neural Networks (DNNs). This operator can be reformulated as a wide regrouping of spatially parallel low-bitwidth calculations that are interleaved across the bit partitions of multiple elements of the vectors. As such, the computational building block of our accelerator becomes a wide bit-interleaved analog vector unit comprising a collection of low-bitwidth multiply-accumulate modules that operate in the analog domain and share a single A/D converter (ADC). This bit-partitioning permits a lower-resolution ADC, while the wide regrouping alleviates the need for an A/D conversion per operation, amortizing its cost across multiple bit-partitions of the vector elements. Moreover, the low-bitwidth modules require a smaller encoding range and also provide larger margins for noise mitigation. We also utilize switched-capacitor design for our bit-level reformulation of DNN operations. The proposed switched-capacitor circuitry performs the regrouped multiplications in the charge domain and accumulates the results of the group in its capacitors over multiple cycles. The capacitive accumulation, combined with wide bit-partitioned regrouping, reduces the rate of A/D conversions, further improving the overall efficiency of the design. With this mathematical reformulation and its switched-capacitor implementation, we define one possible 3D-stacked microarchitecture, dubbed BiHiwe, that leverages clustering and hierarchical design to best utilize the power efficiency of the mixed-signal domain and 3D stacking. We also build models for noise, computational non-idealities, and variations. For ten DNN benchmarks, BiHiwe delivers 5.5× speedup over Tetris, a leading purely-digital 3D-stacked accelerator, with less than 0.5% accuracy loss achieved by careful treatment of noise, computation error, and various forms of variation. Compared to the RTX 2080 Ti with tensor cores and the Titan Xp GPUs, both with 8-bit execution, BiHiwe offers 35.4× and 70.1× higher Performance-per-Watt, respectively. Relative to the mixed-signal RedEye, ISAAC, and PipeLayer, BiHiwe offers 5.5×, 3.6×, and 9.6× improvement in Performance-per-Watt, respectively. The results suggest that BiHiwe is an effective initial step on a road that combines mathematics, circuits, and architecture.
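The regrouping insight from the abstract can be illustrated numerically. The sketch below is not the paper's implementation; it is a minimal Python model assuming unsigned 8-bit operands split into hypothetical 2-bit partitions. Each (j, k) pairing of partitions forms one wide low-bitwidth MAC group: in BiHiwe's design such a group would accumulate in the charge domain and share a single ADC, with the power-of-two shifts applied digitally after conversion.

```python
import numpy as np

def bit_partition(x, num_parts, part_bits):
    """Split each element of x into num_parts slices of part_bits bits,
    least-significant partition first."""
    mask = (1 << part_bits) - 1
    return [(x >> (j * part_bits)) & mask for j in range(num_parts)]

def bit_partitioned_dot(a, b, part_bits=2, total_bits=8):
    """Dot product via wide, bit-interleaved regrouping.

    Each (j, k) group sum below models one analog low-bitwidth MAC
    collection: the sum over all vector elements would be accumulated
    in capacitors and digitized by one shared ADC, so the A/D cost is
    amortized across the whole group rather than paid per multiply.
    """
    num_parts = total_bits // part_bits
    a_parts = bit_partition(a, num_parts, part_bits)
    b_parts = bit_partition(b, num_parts, part_bits)
    result = 0
    for j in range(num_parts):
        for k in range(num_parts):
            group_sum = int(np.dot(a_parts[j], b_parts[k]))  # one shared "A/D conversion"
            result += group_sum << ((j + k) * part_bits)     # digital shift-and-add
    return result

# Sanity check against a standard dot product (unsigned 8-bit operands).
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=16, dtype=np.int64)
b = rng.integers(0, 256, size=16, dtype=np.int64)
assert bit_partitioned_dot(a, b) == int(np.dot(a, b))
```

The identity behind the code: writing each element as a sum of shifted partitions, a_i = Σ_j 2^(jp) a_ij, gives Σ_i a_i b_i = Σ_j Σ_k 2^((j+k)p) (Σ_i a_ij b_ik), so the expensive wide accumulation happens entirely at low bitwidth.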

Original language: English (US)
Title of host publication: PACT 2020 - Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 399-411
Number of pages: 13
ISBN (Electronic): 9781450380751
DOIs
State: Published - Sep 30 2020
Event: 2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020 - Virtual, Online, United States
Duration: Oct 3 2020 - Oct 7 2020

Publication series

Name: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
ISSN (Print): 1089-795X

Conference

Conference: 2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020
Country/Territory: United States
City: Virtual, Online
Period: 10/3/20 - 10/7/20

Keywords

  • Accelerators
  • Analog Error Modeling
  • Analog/Mixed-Signal Computing
  • Bit-Partitioning
  • DNN
  • DNN Acceleration
  • Deep Neural Networks
  • Mixed-Signal Acceleration
  • Spatial Bit-Level Regrouping

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
