Transforming Complex Loop Nests for Locality

Qing Yi, Ken Kennedy, Vikram Adve

Research output: Contribution to journalArticlepeer-review

Abstract

Over the past 20 years, increases in processor speed have dramatically outstripped performance increases for standard memory chips. To bridge this gap, compilers must optimize applications so that data fetched into caches are reused before being displaced. Existing compiler techniques can efficiently optimize simple loop structures such as sequences of perfectly nested loops. However, on more complicated structures, existing techniques are either ineffective or require too much computation time to be practical for a commercial compiler. To optimize complex loop structures both effectively and inexpensively, we present a novel loop transformation, dependence hoisting, for optimizing arbitrarily nested loops, and an efficient framework that applies the new technique to aggressively optimize benchmarks for better locality. Our technique is as inexpensive as the traditional unimodular loop transformation techniques and thus can be incorporated into commercial compilers. In addition, it is highly effective and is able to block several linear algebra kernels containing highly challenging loop structures, in particular, Cholesky, QR, LU factorization without pivoting, and LU with partial pivoting. The automatic blocking of QR and pivoting LU is a notable achievement - to our knowledge, few previous compiler techniques, including theoretically more general loop transformation frameworks [1, 21, 23, 27, 31], were able to completely automate the blocking of these kernels, and none has produced the same blocking as produced by our technique. These results indicate that with low compilation cost, our technique can in practice match the effectiveness of much more expensive frameworks that are theoretically more powerful.

Original languageEnglish (US)
Pages (from-to)219-264
Number of pages46
JournalJournal of Supercomputing
Volume27
Issue number3
DOIs
StatePublished - Mar 2004

Keywords

  • Compiler optimization
  • Complex loop structure
  • Linear algebra kernels
  • Loop transformation

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture

Fingerprint Dive into the research topics of 'Transforming Complex Loop Nests for Locality'. Together they form a unique fingerprint.

Cite this