Compiler-directed dynamic computation reuse: Rationale and initial results

Daniel A. Connors, Wen-Mei W Hwu

Research output: Contribution to journalConference article

Abstract

Recent studies on value locality reveal that many instructions are frequently executed with a small variety of inputs. This paper proposes an approach that integrates architecture and compiler techniques to exploit value locality for large regions of code. The approach strives to eliminate redundant processor execution created by both instruction-level input repetition and recurrence of input data within high-level computations. In this approach, the compiler performs analysis to identify code regions whose computation can be reused during dynamic execution. The instruction set architecture provides a simple interface for the compiler to communicate the scope of each reuse region and its live-out register information to the hardware. During run time, the execution results of these reusable computation regions are recorded into hardware buffers for potential reuse. Each reuse can eliminate the execution of a large number of dynamic instructions. Furthermore, the actions needed to update the live-out registers can be performed at a higher degree of parallelism than the original code, breaking intrinsic dataflow dependence constraints. Initial results show that the compiler analysis can indeed identify large reuse regions. Overall, the approach can improve the performance of a 6-issue microarchitecture by an average of 30% for a collection of SPEC and integer benchmarks.

Original languageEnglish (US)
Pages (from-to)158-169
Number of pages12
JournalProceedings of the Annual International Symposium on Microarchitecture
StatePublished - Dec 1 1999
EventProceedings of the 1999 32nd Annual ACM/IEEE International Symposium on Microarchitecture, MICRO-32 - Haifa, Isr
Duration: Nov 16 1999Nov 18 1999

Fingerprint

Hardware

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software

Cite this

Compiler-directed dynamic computation reuse : Rationale and initial results. / Connors, Daniel A.; Hwu, Wen-Mei W.

In: Proceedings of the Annual International Symposium on Microarchitecture, 01.12.1999, p. 158-169.

Research output: Contribution to journalConference article

@article{1387008d5f9a4266a408b4deb78a11f2,
title = "Compiler-directed dynamic computation reuse: Rationale and initial results",
abstract = "Recent studies on value locality reveal that many instructions are frequently executed with a small variety of inputs. This paper proposes an approach that integrates architecture and compiler techniques to exploit value locality for large regions of code. The approach strives to eliminate redundant processor execution created by both instruction-level input repetition and recurrence of input data within high-level computations. In this approach, the compiler performs analysis to identify code regions whose computation can be reused during dynamic execution. The instruction set architecture provides a simple interface for the compiler to communicate the scope of each reuse region and its live-out register information to the hardware. During run time, the execution results of these reusable computation regions are recorded into hardware buffers for potential reuse. Each reuse can eliminate the execution of a large number of dynamic instructions. Furthermore, the actions needed to update the live-out registers can be performed at a higher degree of parallelism than the original code, breaking intrinsic dataflow dependence constraints. Initial results show that the compiler analysis can indeed identify large reuse regions. Overall, the approach can improve the performance of a 6-issue microarchitecture by an average of 30{\%} for a collection of SPEC and integer benchmarks.",
author = "Connors, {Daniel A.} and Hwu, {Wen-Mei W}",
year = "1999",
month = "12",
day = "1",
language = "English (US)",
pages = "158--169",
journal = "Proceedings of the Annual International Symposium on Microarchitecture, MICRO",
issn = "1072-4451",

}

TY - JOUR

T1 - Compiler-directed dynamic computation reuse

T2 - Rationale and initial results

AU - Connors, Daniel A.

AU - Hwu, Wen-Mei W

PY - 1999/12/1

Y1 - 1999/12/1

N2 - Recent studies on value locality reveal that many instructions are frequently executed with a small variety of inputs. This paper proposes an approach that integrates architecture and compiler techniques to exploit value locality for large regions of code. The approach strives to eliminate redundant processor execution created by both instruction-level input repetition and recurrence of input data within high-level computations. In this approach, the compiler performs analysis to identify code regions whose computation can be reused during dynamic execution. The instruction set architecture provides a simple interface for the compiler to communicate the scope of each reuse region and its live-out register information to the hardware. During run time, the execution results of these reusable computation regions are recorded into hardware buffers for potential reuse. Each reuse can eliminate the execution of a large number of dynamic instructions. Furthermore, the actions needed to update the live-out registers can be performed at a higher degree of parallelism than the original code, breaking intrinsic dataflow dependence constraints. Initial results show that the compiler analysis can indeed identify large reuse regions. Overall, the approach can improve the performance of a 6-issue microarchitecture by an average of 30% for a collection of SPEC and integer benchmarks.

AB - Recent studies on value locality reveal that many instructions are frequently executed with a small variety of inputs. This paper proposes an approach that integrates architecture and compiler techniques to exploit value locality for large regions of code. The approach strives to eliminate redundant processor execution created by both instruction-level input repetition and recurrence of input data within high-level computations. In this approach, the compiler performs analysis to identify code regions whose computation can be reused during dynamic execution. The instruction set architecture provides a simple interface for the compiler to communicate the scope of each reuse region and its live-out register information to the hardware. During run time, the execution results of these reusable computation regions are recorded into hardware buffers for potential reuse. Each reuse can eliminate the execution of a large number of dynamic instructions. Furthermore, the actions needed to update the live-out registers can be performed at a higher degree of parallelism than the original code, breaking intrinsic dataflow dependence constraints. Initial results show that the compiler analysis can indeed identify large reuse regions. Overall, the approach can improve the performance of a 6-issue microarchitecture by an average of 30% for a collection of SPEC and integer benchmarks.

UR - http://www.scopus.com/inward/record.url?scp=0033316832&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033316832&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:0033316832

SP - 158

EP - 169

JO - Proceedings of the Annual International Symposium on Microarchitecture, MICRO

JF - Proceedings of the Annual International Symposium on Microarchitecture, MICRO

SN - 1072-4451

ER -