An advanced compiler framework for non-cache-coherent multiprocessors

Yunheung Paek, Angeles Navarro, Emilio Zapata, Jay Hoeflinger, David Padua

Research output: Contribution to journal (Article, peer-reviewed)

Abstract

The Cray T3D and T3E are non-cache-coherent (NCC) computers with a NUMA structure. They have been shown to exhibit a very stable and scalable performance for a variety of application programs. Considerable evidence suggests that they are more stable and scalable than many other shared-memory multiprocessors. However, the principal drawback of these machines is a lack of programmability, caused by the absence of the global cache coherence that is necessary to provide a convenient shared view of memory in hardware. This forces the programmer to keep careful track of where each piece of data is stored, a complication that is unnecessary when a pure shared-memory view is presented to the user. We believe that a remedy for this problem is advanced compiler technology. In this paper, we present our experience with a compiler framework for automatic parallelization and communication generation that has the potential to reduce the time-consuming hand-tuning that would otherwise be necessary to achieve good performance with this type of machine. From our experiments, we learned that our compiler performs well for a variety of applications on the T3D and T3E and we found a few sophisticated techniques that could improve performance even more once they are fully implemented in the compiler.

Original language: English (US)
Pages (from-to): 241-259
Number of pages: 19
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 13
Issue number: 3
State: Published - Mar 2002

Keywords

  • Array privatization
  • Compiler
  • Dependence analysis
  • Multiprocessors
  • Noncoherent caches
  • Put/Get
  • Shared-memory programming

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics
