Aquabolt-XL HBM2-PIM, LPDDR5-PIM With In-Memory Processing, and AXDIMM With Acceleration Buffer

Jin Hyun Kim, Shin Haeng Kang, Sukhan Lee, Hyeonsu Kim, Yuhwan Ro, Seungwon Lee, David Wang, Jihyun Choi, Jinin So, Yeon Gon Cho, Joon Ho Song, Jeonghyeon Cho, Kyomin Sohn, Nam Sung Kim

Research output: Contribution to journal › Article › peer-review


Processing-in-memory (PIM) has been proposed to improve the performance of bandwidth-intensive workloads and to save energy by reducing data movement between compute and memory. To realize PIM, programmable computing units were integrated with memory cores on an HBM2 device to enable parallel processing and minimize data movement. A graphics processing unit (GPU) system equipped with Samsung Aquabolt-XL HBM2-PIM devices improved the performance of a general matrix-vector multiplication (GEMV) microkernel and of speech recognition applications by 8.9× and 3.5×, respectively, and reduced energy consumption by over 60%. In a Xilinx Alveo U280 system, the performance of the GEMV and ADD microkernel workloads improved by 2.8×, and that of a long short-term memory (LSTM) workload improved by 2.54×. Simulations show that a performance gain of over 2.3× may be attained in a system with LPDDR5-PIM (LP5-PIM) for certain transformer-based speech recognition workloads, with an energy reduction of 86%. In addition, AXDIMM, a DIMM-level PIM with acceleration buffers, exhibits an 80% performance improvement and 42.6% energy savings over a regular RDIMM system.

Original language: English (US)
Pages (from-to): 20-30
Number of pages: 11
Journal: IEEE Micro
Issue number: 3
State: Published - 2022


  • HBM2
  • LPDDR5
  • PIM
  • Processing-In-Memory

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Electrical and Electronic Engineering

