Improved Superblock optimization in GCC

Robert Kidd, Wen-Mei W Hwu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Superblock scheduling is a common technique to increase the level of instruction-level parallelism (ILP) in generated code. Compared to a basic block, the Superblock gives an optimizer or scheduler a longer range over which instructions can be moved. The bookkeeping necessary to execute that move is less than would be necessary inside an arbitrary trace region. Additionally, the process of forming Superblocks generates more instructions that are eligible for movement. These factors combine to produce a significant increase in the ILP in a section of code. By identifying the key feature of Superblock formation that allows this increase in ILP, we can generalize the concept to describe a class of similar optimizations. We refer to techniques in this class as structural techniques. Combining several optimizations in this class with aggressive classical optimization has been shown in the OpenIMPACT compiler to be particularly useful in developing ILP when compiling for the Itanium processor. As a motivation for our work, we present an investigation into the value of structural compilation in the OpenIMPACT compiler. In this domain, structural techniques have been credited with a 10% to 13% increase in code performance over a compiler that implements only classical optimizations. As a first step toward developing structural compilation techniques in GCC, we implemented Superblock formation at the Tree-SSA level. By performing structural transformations early, we give the compiler's high-level optimizers an opportunity to specialize the transformed program, thereby cultivating higher levels of ILP. The early results of this modification are mixed, with some benchmarks improving and others slowing. In this paper, we present details on our implementation and study the effects of this structural transformation on later optimizations. Through this, we hope to motivate future work to implement and improve optimizations that can take advantage of the transformed control flow.

Original language: English (US)
Title of host publication: Proceedings of the GCC Developers' Summit 2006
Pages: 85-96
Number of pages: 12
State: Published - 2006
Event: GCC and GNU Toolchain Developers' Summit 2006 - Ottawa, ON, Canada
Duration: Jun 28 2006 → Jun 30 2006

Other

Other: GCC and GNU Toolchain Developers' Summit 2006
Country: Canada
City: Ottawa, ON
Period: 6/28/06 → 6/30/06

Fingerprint

Flow control
Scheduling

ASJC Scopus subject areas

  • Software

Cite this

Kidd, R., & Hwu, W-M. W. (2006). Improved Superblock optimization in GCC. In Proceedings of the GCC Developers' Summit 2006 (pp. 85-96).


@inproceedings{00c743efdf10458cb7b7db5a2f209bcc,
title = "Improved Superblock optimization in GCC",
abstract = "Superblock scheduling is a common technique to increase the level of instruction level parallelism (ILP) in generated code. Compared to a basic block, the Superblock gives an optimizer or scheduler a longer range over which instructions can be moved. The bookkeeping necessary to execute that move is less than would be necessary inside an arbitrary trace region. Additionally, the process of forming Superblocks generates more instructions that are eligible for movement. These factors combine to produce a significant increase in the ILP in a section of code. By identifying the key feature of Superblock formation that allows this increase in ILP, we can generalize the concept to describe a class of similar optimizations. We refer to techniques in this class as structural techniques. Combining several optimizations in this class with aggressive classical optimization has been shown in the OpenIMPACT compiler to be particularly useful in developing ILP when compiling for the Itanium processor. As a motivation for our work, we present an investigation into the value of structural compilation in the OpenIMPACT compiler. In this domain, structural techniques have been credited with a 10{\%} to 13{\%} increase in code performance over a compiler that implements only classical optimizations. As a first step toward developing structural compilation techniques in GCC, we implemented Superblock formation at the Tree-SSA level. By performing structural transformations early, we give the compiler's high level optimizers an opportunity to specialize the transformed program, thereby cultivating higher levels of ILP. The early results of this modification are mixed, with some benchmarks improving and others slowing. In this paper, we present details on our implementation and study the effects of this structural transformation on later optimizations. 
Through this, we hope to motivate future work to implement and improve optimizations that can take advantage of the transformed control flow.",
author = "Kidd, Robert and Hwu, {Wen-Mei W.}",
year = "2006",
language = "English (US)",
pages = "85--96",
booktitle = "Proceedings of the GCC Developers' Summit 2006",

}

TY - GEN

T1 - Improved Superblock optimization in GCC

AU - Kidd, Robert

AU - Hwu, Wen-Mei W

PY - 2006

Y1 - 2006

AB - Superblock scheduling is a common technique to increase the level of instruction level parallelism (ILP) in generated code. Compared to a basic block, the Superblock gives an optimizer or scheduler a longer range over which instructions can be moved. The bookkeeping necessary to execute that move is less than would be necessary inside an arbitrary trace region. Additionally, the process of forming Superblocks generates more instructions that are eligible for movement. These factors combine to produce a significant increase in the ILP in a section of code. By identifying the key feature of Superblock formation that allows this increase in ILP, we can generalize the concept to describe a class of similar optimizations. We refer to techniques in this class as structural techniques. Combining several optimizations in this class with aggressive classical optimization has been shown in the OpenIMPACT compiler to be particularly useful in developing ILP when compiling for the Itanium processor. As a motivation for our work, we present an investigation into the value of structural compilation in the OpenIMPACT compiler. In this domain, structural techniques have been credited with a 10% to 13% increase in code performance over a compiler that implements only classical optimizations. As a first step toward developing structural compilation techniques in GCC, we implemented Superblock formation at the Tree-SSA level. By performing structural transformations early, we give the compiler's high level optimizers an opportunity to specialize the transformed program, thereby cultivating higher levels of ILP. The early results of this modification are mixed, with some benchmarks improving and others slowing. In this paper, we present details on our implementation and study the effects of this structural transformation on later optimizations. 
Through this, we hope to motivate future work to implement and improve optimizations that can take advantage of the transformed control flow.

UR - http://www.scopus.com/inward/record.url?scp=84871260238&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84871260238&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84871260238

SP - 85

EP - 96

BT - Proceedings of the GCC Developers' Summit 2006

ER -