MIMDRAM: An End-To-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing

Geraldo F. Oliveira, Ataberk Olgun, Abdullah Giray Yaglikci, F. Nisa Bostanci, Juan Gomez-Luna, Saugata Ghose, Onur Mutlu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a DRAM array's massive internal parallelism to execute very-wide (e.g., 16,384-262,144-bit-wide) data-parallel operations, in a single-instruction multiple-data (SIMD) fashion. However, DRAM rows' large and rigid granularity limit the effectiveness and applicability of PUD in three ways. First, since applications have varying degrees of SIMD parallelism (which is often smaller than the DRAM row granularity), PUD execution often leads to underutilization, through-put loss, and energy waste. Second, due to the high area cost of implementing interconnects that connect columns in a wide DRAM row, most PUD architectures are limited to the execution of parallel map operations, where a single operation is performed over equally-sized input and output arrays. Third, the need to feed the wide DRAM row with tens of thousands of data elements combined with the lack of adequate compiler support for PUD systems create a programmability barrier, since programmers need to manually extract SIMD parallelism from an application and map computation to the PUD hardware. Our goal is to design a flexible PUD system that overcomes the limitations caused by the large and rigid granularity of PUD. To this end, we propose MIMDRAM, a hardware/software co-designed PUD system that introduces new mechanisms to allocate and control only the necessary resources for a given PUD operation. The key idea of MIMDRAM is to leverage fine-grained DRAM (i.e., the ability to independently access smaller segments of a large DRAM row) for PUD computation. MIMDRAM exploits this key idea to enable a multiple-instruction multiple-data (MIMD) execution model in each DRAM subarray (and SIMD execution within each DRAM row segment). We evaluate MIMDRAM using twelve real-world applications and 495 multi-programmed application mixes. Our evaluation shows that MIMDRAM provides 34 × the performance, 14.3 × the energy efficiency, 1.7 × the throughput, and 1.3 × the fairness of a state-of-The-Art PUD framework, along with 30.6 × and 6.8 × the energy efficiency of a high-end CPU and GPU, respectively. MIMDRAM adds small area cost to a DRAM chip (1.11%) and CPU die (0.6%). We hope and believe that MIMDRAM's ideas and results will help to enable more efficient and easy-To-program PUD systems. To this end, we open source MIMDRAM at https://glthub.com/CMU-SAFARI/MIMDRAM.

Original languageEnglish (US)
Title of host publicationProceedings - 2024 IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024
PublisherIEEE Computer Society
Pages186-203
Number of pages18
ISBN (Electronic)9798350393132
DOIs
StatePublished - 2024
Event30th IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024 - Edinburgh, United Kingdom
Duration: Mar 2 2024Mar 6 2024

Publication series

NameProceedings - International Symposium on High-Performance Computer Architecture
ISSN (Print)1530-0897

Conference

Conference30th IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024
Country/TerritoryUnited Kingdom
CityEdinburgh
Period3/2/243/6/24

Keywords

  • DRAM
  • energy-efficiency
  • hardware/software co-design
  • memory-centric computing
  • processing-in-memory

ASJC Scopus subject areas

  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'MIMDRAM: An End-To-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing'. Together they form a unique fingerprint.

Cite this