Automatic code mapping on an intelligent memory architecture

Yan Solihin, Jaejin Lee, Josep Torrellas

Research output: Contribution to journal, Article

Abstract

This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a high-end host processor and a simpler memory processor. To achieve high performance with this type of architecture, the code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the two processors should overlap their execution as much as possible. With our algorithm, applications are mapped fully automatically using both static and dynamic information. Using a set of standard applications and a simulated architecture, we obtain average speedups of 1.7 for numerical applications and 1.2 for nonnumerical applications over a single host with plain memory. The speedups are very close to, and often higher than, ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited and represents one step toward effectively mapping code on heterogeneous intelligent memory systems.

Original language: English (US)
Pages (from-to): 1248-1266
Number of pages: 19
Journal: IEEE Transactions on Computers
Volume: 50
Issue number: 11
State: Published - Nov 2001

Keywords

  • Adaptive execution
  • Compilers
  • Heterogeneous system
  • Intelligent memory architecture
  • Performance prediction
  • Processing-in-memory

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics
