SIMDRAM: A framework for bit-serial SIMD processing using DRAM

Nastaran Hajinazar, Geraldo F. Oliveira, Sven Gregorio, Joaõ Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gómez-Luna, Onur Mutlu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that (1) enables the efficient implementation of complex operations, and (2) provides a flexible mechanism tosupport the implementation of arbitrary user-defined operations. The SIMDRAM framework comprises three key steps. The first step builds an efficient MAJ/NOT representation of a given desired operation. The second step allocates DRAM rows that are reserved for computation to the operation's input and output operands, and generates the required sequence of DRAM commands to perform the MAJ/NOT implementation of the desired operation in DRAM. The third step uses the SIMDRAM control unit located inside the memory controller to manage the computation of the operation from start to end, by executing the DRAM commands generated in the second step of the framework. We design the hardware and ISA support for SIMDRAM framework to (1) address key system integration challenges, and (2) allow programmers to employ new SIMDRAM operations without hardware changes. We evaluate SIMDRAM for reliability, area overhead, throughput, and energy efficiency using a wide range of operations and seven real-world applications to demonstrate SIMDRAM's generality. Our evaluations using a single DRAM bank show that (1) over 16 operations, SIMDRAM provides 2.0X the throughput and 2.6X the energy efficiency of Ambit, a state-of-the-art processing-using-DRAM mechanism; (2) over seven real-world applications, SIMDRAM provides 2.5X the performance of Ambit. Using 16 DRAM banks, SIMDRAM provides (1) 88X and 5.8X the throughput, and 257X and 31X the energy efficiency, of a CPU and a high-end GPU, respectively, over 16 operations; (2) 21X and 2.1X the performance of the CPU and GPU, over seven real-world applications. SIMDRAM incurs an area overhead of only 0.2% in a high-end CPU.

Original languageEnglish (US)
Title of host publicationProceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
PublisherAssociation for Computing Machinery
Pages329-345
Number of pages17
ISBN (Electronic)9781450383172
DOIs
StatePublished - Apr 19 2021
Event26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021 - Virtual, Online, United States
Duration: Apr 19 2021Apr 23 2021

Publication series

NameInternational Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

Conference

Conference26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
Country/TerritoryUnited States
CityVirtual, Online
Period4/19/214/23/21

Keywords

  • Bulk Bitwise Operations
  • DRAM
  • Energy
  • Performance
  • Processing-Using-Memory
  • Processing-in-Memory

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'SIMDRAM: A framework for bit-serial SIMD processing using DRAM'. Together they form a unique fingerprint.

Cite this