Memory-side prefetching for linked data structures for processor-in-memory systems

Christopher J. Hughes, Sarita V. Adve

Research output: Contribution to journal › Article › peer-review


This paper studies a memory-side prefetching technique to hide latency incurred by inherently serial accesses to linked data structures (LDS). A programmable engine sits close to memory and traverses LDS independently of the processor. The engine can run ahead of the processor because of its low-latency path to memory, allowing it to initiate data transfers earlier than the processor and pipeline multiple transfers over the network. We evaluate the proposed memory-side prefetching scheme for the Olden benchmarks on a processor-in-memory system. For the six benchmarks where LDS memory stall time is significant, the memory-side scheme reduces execution time by an average of 27% compared to a system without any prefetching. Compared to a state-of-the-art processor-side software prefetching scheme, the memory-side scheme reduces execution time by 20-50% for three of the six applications, performs about the same for two applications, and is worse by 18% for one application. We conclude that our memory-side scheme is effective, but that a combination of the processor-side and memory-side prefetching schemes is best, and we provide a qualitative framework to determine when either scheme should be used.

Original language: English (US)
Pages (from-to): 448-463
Number of pages: 16
Journal: Journal of Parallel and Distributed Computing
Issue number: 4
State: Published - Apr 2005


  • Linked data structures
  • Prefetching
  • Processor-in-memory

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence

