Abstract
Neural networks (NNs) are growing in importance and complexity. An NN's performance (and energy efficiency) can be bound either by computation or by memory resources. The processing-in-memory (PIM) paradigm, where computation is placed near or within memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures vary in form, and different PIM approaches lead to different trade-offs. Our goal is to analyze, discuss, and contrast DRAM-based PIM architectures for NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip, (2) Mensa, a 3D-stacking-based PIM architecture tailored for edge devices, and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis reveals that PIM greatly benefits memory-bound NNs: (i) UPMEM provides 23× the performance of a high-end GPU when the GPU requires memory oversubscription for a GEMV kernel, (ii) Mensa improves energy efficiency and throughput by 3.0× and 3.1× over the baseline Edge TPU for 24 Google edge NN models, and (iii) SIMDRAM outperforms a CPU/GPU by 16.7×/1.4× for three binary NNs. We conclude that, due to their inherent limitations, each PIM architecture better suits the execution of NN models with distinct attributes.
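The abstract notes that GEMV is memory-bound, which is what makes it a good fit for PIM. A minimal sketch of why, using the standard arithmetic-intensity argument (the function name and the example size are illustrative, not from the paper): in GEMV, each matrix element is read from memory exactly once and used in exactly one multiply-add, so the ratio of floating-point operations to bytes moved stays very low regardless of problem size.

```python
# Rough arithmetic-intensity estimate for GEMV (y = A @ x), illustrating
# why the kernel is memory-bound. Sizes and the float32 element size are
# illustrative assumptions, not values from the paper.
def gemv_arithmetic_intensity(n, dtype_bytes=4):
    flops = 2 * n * n                            # one multiply + one add per matrix element
    bytes_moved = dtype_bytes * (n * n + 2 * n)  # read A and x, write y
    return flops / bytes_moved

# For float32, intensity approaches but never exceeds 0.5 FLOP/byte,
# far below the compute/bandwidth balance point of a modern GPU, so
# GEMV throughput is limited by memory bandwidth, not compute.
intensity = gemv_arithmetic_intensity(8192)
```

Since performance is capped by memory bandwidth, moving the computation next to (or into) the DRAM arrays, as the three architectures above do in different ways, attacks the actual bottleneck.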
| Original language | English (US) |
|---|---|
| Pages (from-to) | 1-14 |
| Number of pages | 14 |
| Journal | IEEE Micro |
| Volume | 42 |
| Issue number | 6 |
| DOIs | |
| State | E-pub ahead of print - Aug 29 2022 |
Keywords
- Analytical models
- Artificial neural networks
- Computational modeling
- Computer architecture
- Energy efficiency
- Random access memory
- Throughput
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Electrical and Electronic Engineering