Data layout transformation exploiting memory-level parallelism in structured grid many-core applications

I-Jui Sung, John A. Stratton, Wen-mei W. Hwu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present automatic data layout transformation as an effective compiler performance optimization for memory-bound structured grid applications. Structured grid applications include stencil codes and other code structures using a dense, regular grid as the primary data structure. Fluid dynamics and heat distribution, which both solve partial differential equations on a discretized representation of space, are representative of many important structured grid applications. Using the information available through variable-length array syntax, standardized in C99 and other modern languages, we have enabled automatic data layout transformations for structured grid codes with dynamically allocated arrays. We also present how a tool can guide these transformations to statically choose a good layout given a model of the memory system, using a modern GPU as an example. A transformed layout that distributes concurrent memory requests among parallel memory system components provides substantial speedup for structured grid applications by improving their achieved memory-level parallelism. Even with the overhead of more complex address calculations, we observe performance increases of up to 560% over the language-defined layout, and a 7% performance gain in the worst case, in which the language-defined layout and access pattern are already well-vectorizable by the underlying hardware.
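To make the idea of a layout transformation concrete, the following is a minimal sketch in C of how the address calculation changes when a grid's layout is reorganized. It is an illustration only, not the transformation or tool described in the paper: the function names `default_offset` and `tiled_offset`, the per-cell field count `ne`, and the `tile` parameter are all assumptions introduced here. The sketch assumes a grid declared as `grid[ny][nx][ne]` and a hypothetical transformed layout `grid[ny][nx / tile][ne][tile]`, in which the same field of `tile` neighboring cells is stored contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Language-defined (row-major) offset of field e of cell (y, x)
 * in a grid declared as grid[ny][nx][ne]. */
size_t default_offset(size_t y, size_t x, size_t e, size_t nx, size_t ne)
{
    return (y * nx + x) * ne + e;
}

/* Hypothetical transformed layout grid[ny][nx / tile][ne][tile]:
 * the x dimension is split into (x / tile, x % tile) and the inner
 * part is moved past the field dimension. This is an illustrative
 * assumption, not the layout chosen by the paper's tool. */
size_t tiled_offset(size_t y, size_t x, size_t e,
                    size_t nx, size_t ne, size_t tile)
{
    /* The sketch only handles rows that divide evenly into tiles. */
    assert(tile > 0 && nx % tile == 0);

    size_t x_tile = x / tile; /* which tile along x */
    size_t x_in = x % tile;   /* position inside that tile */

    return ((y * (nx / tile) + x_tile) * ne + e) * tile + x_in;
}
```

For example, with `nx = 8`, `ne = 4`, and `tile = 4`, the same field of cells `x = 0` and `x = 1` lies 4 elements apart in the default layout but in adjacent elements in the tiled one, at the cost of the extra division and modulo in the address calculation.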

Original language: English (US)
Title of host publication: PACT'10 - Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 513-522
Number of pages: 10
ISBN (Print): 9781450301787
State: Published - 2010
Event: 19th International Conference on Parallel Architectures and Compilation Techniques, PACT 2010 - Vienna, Austria
Duration: Sep 11 2010 → Sep 15 2010

Publication series

Name: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
ISSN (Print): 1089-795X

Conference

Conference: 19th International Conference on Parallel Architectures and Compilation Techniques, PACT 2010
Country/Territory: Austria
City: Vienna
Period: 9/11/10 → 9/15/10

Keywords

  • GPU
  • data layout transformation
  • parallel programming

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
