Comparing and combining read miss clustering and software prefetching

V. S. Pai, Sarita V Adve

Research output: Contribution to journalConference articlepeer-review


A recent latency tolerance technique, read miss clustering, restructures code to send demand miss references in parallel to the underlying memory system. An alternate, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand miss references by a certain distance. Since both techniques seem to target the same types of latencies and use the same system resources, it is unclear which technique is superior or if both can be combined. This paper shows that these two techniques are actually mutually beneficial, each helping to overcome limitations of the other. We perform our study for uniprocessor and multiprocessor configurations, in simulation and on a real machine (Convex Exemplar). Compared to prefetching alone (the state-of-the-art implemented in systems today), the combination of the two techniques reduces execution time an average of 21% across all cases studied in simulation and an average of 16% for 5 out of 10 cases on the Exemplar. The combination sees execution time reductions relative to clustering alone averaging 15% for 6 out of 11 cases in simulation and 20% for 6 out of 10 cases on the Exemplar.

Original languageEnglish (US)
Pages (from-to)292-303
Number of pages12
JournalParallel Architectures and Compilation Techniques - Conference Proceedings, PACT
StatePublished - 2001
Externally publishedYes
EventInternatinal Conference on Parallel Architectures and Compilation Techniques PACT 2001 - Barcelona, Spain
Duration: Sep 8 2001Sep 12 2001

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture


Dive into the research topics of 'Comparing and combining read miss clustering and software prefetching'. Together they form a unique fingerprint.

Cite this