Enhancing loop buffering of media and telecommunications applications using low-overhead predication

John W. Sias, Hillery C. Hunter, Wen-Mei W Hwu

Research output: Contribution to journal › Article

Abstract

Media- and telecommunications-focused processors, increasingly designed as deeply pipelined, statically-scheduled VLIWs, rely on loop buffers for low-overhead execution of simple loops. Key loops containing control flow pose a substantial problem: full predication has a high encoding overhead, and partial predication techniques do not support if-conversion, the transformation of general acyclic control flow into predicated blocks. Using a set of significant media processing benchmarks, drawn from MediaBench and contemporary telecommunications standards, we explore a compromise approach. We demonstrate a compiler using if-conversion and specialized loop transformations to arrange for 70-99% of fetched operations to come from a simple, statically managed 256-instruction loop buffer, saving instruction fetch power and eliminating branch penalties. To complement this, we introduce a "niche" form of predication specialized to permit general if-conversion with only a single bit in the encoding of each operation and to eliminate much of the hardware overhead of a predicate register-based approach.
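
As an illustration of the if-conversion mentioned in the abstract, the following Python sketch shows a loop body written with a branch and then rewritten as a single straight-line block in which the conditional work is guarded by a predicate. It is not taken from the paper: the saturation example, the function names, and the `guarded_move` helper are all hypothetical, and modelling a predicate as a Python boolean says nothing about the single-bit encoding the authors propose.

```python
def guarded_move(pred, new_value, old_value):
    """Model a predicated move: the write takes effect only when pred is true.

    Hypothetical helper for illustration; it stands in for a guarded operation.
    """
    return new_value if pred else old_value


def saturate_branching(samples, limit):
    """Loop whose body contains control flow (an if/else diamond)."""
    out = []
    for s in samples:
        if s > limit:
            r = limit
        else:
            r = s
        out.append(r)
    return out


def saturate_if_converted(samples, limit):
    """The same loop after if-conversion: one straight-line block per iteration."""
    out = []
    for s in samples:
        p = s > limit                  # predicate define replaces the branch
        r = s                          # unconditional default value
        r = guarded_move(p, limit, r)  # issued every iteration, commits only if p
        out.append(r)
    return out


if __name__ == "__main__":
    data = [3, 120, -7, 200]
    # Both versions compute the same result; only the control structure differs.
    print(saturate_branching(data, 127))
    print(saturate_if_converted(data, 127))
```

In the converted version the loop body contains no internal branch, which is the property that would let a simple loop buffer hold it; the cost is that the guarded operation is issued on every iteration whether or not its predicate is true.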

Original language: English (US)
Pages (from-to): 262-273
Number of pages: 12
Journal: Proceedings of the Annual International Symposium on Microarchitecture
DOI: https://doi.org/10.1109/MICRO.2001.991124
State: Published - Jan 1 2001

Fingerprint

Flow control
Telecommunication
Hardware
Processing

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software

Cite this

@article{266e0cad866946dd8ac64e250744818c,
title = "Enhancing loop buffering of media and telecommunications applications using low-overhead predication",
abstract = "Media- and telecommunications-focused processors, increasingly designed as deeply pipelined, statically-scheduled VLIWs, rely on loop buffers for low-overhead execution of simple loops. Key loops containing control flow pose a substantial problem: full predication has a high encoding overhead, and partial predication techniques do not support if-conversion, the transformation of general acyclic control flow into predicated blocks. Using a set of significant media processing benchmarks, drawn from MediaBench and contemporary telecommunications standards, we explore a compromise approach. We demonstrate a compiler using if-conversion and specialized loop transformations to arrange for 70-99{\%} of fetched operations to come from a simple, statically managed 256-instruction loop buffer, saving instruction fetch power and eliminating branch penalties. To complement this, we introduce a ``niche'' form of predication specialized to permit general if-conversion with only a single bit in the encoding of each operation and to eliminate much of the hardware overhead of a predicate register-based approach.",
author = "Sias, {John W.} and Hunter, {Hillery C.} and Hwu, {Wen-Mei W}",
year = "2001",
month = "1",
day = "1",
doi = "10.1109/MICRO.2001.991124",
language = "English (US)",
pages = "262--273",
journal = "Proceedings of the Annual International Symposium on Microarchitecture, MICRO",
issn = "1072-4451",

}

TY  - JOUR
T1  - Enhancing loop buffering of media and telecommunications applications using low-overhead predication
AU  - Sias, John W.
AU  - Hunter, Hillery C.
AU  - Hwu, Wen-Mei W
PY  - 2001/1/1
Y1  - 2001/1/1
N2  - Media- and telecommunications-focused processors, increasingly designed as deeply pipelined, statically-scheduled VLIWs, rely on loop buffers for low-overhead execution of simple loops. Key loops containing control flow pose a substantial problem: full predication has a high encoding overhead, and partial predication techniques do not support if-conversion, the transformation of general acyclic control flow into predicated blocks. Using a set of significant media processing benchmarks, drawn from MediaBench and contemporary telecommunications standards, we explore a compromise approach. We demonstrate a compiler using if-conversion and specialized loop transformations to arrange for 70-99% of fetched operations to come from a simple, statically managed 256-instruction loop buffer, saving instruction fetch power and eliminating branch penalties. To complement this, we introduce a "niche" form of predication specialized to permit general if-conversion with only a single bit in the encoding of each operation and to eliminate much of the hardware overhead of a predicate register-based approach.
AB  - Media- and telecommunications-focused processors, increasingly designed as deeply pipelined, statically-scheduled VLIWs, rely on loop buffers for low-overhead execution of simple loops. Key loops containing control flow pose a substantial problem: full predication has a high encoding overhead, and partial predication techniques do not support if-conversion, the transformation of general acyclic control flow into predicated blocks. Using a set of significant media processing benchmarks, drawn from MediaBench and contemporary telecommunications standards, we explore a compromise approach. We demonstrate a compiler using if-conversion and specialized loop transformations to arrange for 70-99% of fetched operations to come from a simple, statically managed 256-instruction loop buffer, saving instruction fetch power and eliminating branch penalties. To complement this, we introduce a "niche" form of predication specialized to permit general if-conversion with only a single bit in the encoding of each operation and to eliminate much of the hardware overhead of a predicate register-based approach.
UR  - http://www.scopus.com/inward/record.url?scp=0035691557&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0035691557&partnerID=8YFLogxK
U2  - 10.1109/MICRO.2001.991124
DO  - 10.1109/MICRO.2001.991124
M3  - Article
AN  - SCOPUS:0035691557
SP  - 262
EP  - 273
JO  - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
JF  - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SN  - 1072-4451
ER  - 