The impact of exploiting instruction-level parallelism on shared-memory multiprocessors

Vijay S. Pai, Parthasarathy Ranganathan, Hazim Abdel-Shafi, Sarita Adve

Research output: Contribution to journal › Article › peer-review

Abstract

Current microprocessors incorporate techniques to aggressively exploit instruction-level parallelism (ILP). This paper evaluates the impact of such processors on the performance of shared-memory multiprocessors, both without and with the latency-hiding optimization of software prefetching. Our results show that, while ILP techniques substantially reduce CPU time in multiprocessors, they are less effective in removing memory stall time. Consequently, despite the inherent latency tolerance features of ILP processors, we find memory system performance to be a larger bottleneck and parallel efficiencies to be generally poorer in ILP-based multiprocessors than in previous-generation multiprocessors. The main reasons for these deficiencies are insufficient opportunities in the applications to overlap multiple load misses and increased contention for resources in the system. We also find that software prefetching does not change the memory-bound nature of most of our applications on our ILP multiprocessor, mainly due to a large number of late prefetches and resource contention. Our results suggest the need for additional latency hiding or reducing techniques for ILP systems, such as software clustering of load misses and producer-initiated communication.

Original language: English (US)
Pages (from-to): 218-226
Number of pages: 9
Journal: IEEE Transactions on Computers
Volume: 48
Issue number: 2
DOIs
State: Published - 1999
Externally published: Yes

Keywords

  • Instruction-level parallelism
  • Performance evaluation
  • Shared-memory multiprocessors
  • Software prefetching

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
