Unrolling-based optimizations for modulo scheduling

Daniel M. Lavery, Wen-mei W. Hwu

Research output: Contribution to journal › Conference article

Abstract

Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the register requirements of the computation in the loop body. Traditionally, unrolling followed by acyclic scheduling of the unrolled body has been an alternative to modulo scheduling. However, there are benefits to unrolling even if the loop is to be modulo scheduled. Unrolling can improve the throughput by allowing a smaller non-integral effective initiation interval to be achieved. After unrolling, optimizations can be applied to the loop that reduce both the resource requirements and the height of the critical paths. Together, unrolling and unrolling-based optimizations can enable the completion of multiple iterations per cycle in some cases. This paper describes the benefits of unrolling and a set of optimizations for unrolled loops which have been implemented in the IMPACT compiler. The performance benefits of unrolling for five of the SPEC CFP92 programs are reported.
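
The non-integral effective initiation interval mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes the standard resource-constrained lower bound on the initiation interval (operations per iteration divided by available units, rounded up to a whole cycle), considers a single resource class, and ignores recurrences and register pressure. The function names `res_mii` and `effective_ii` are hypothetical.

```python
from fractions import Fraction


def res_mii(ops: int, units: int) -> int:
    """Resource-constrained lower bound on the initiation interval (II).

    Assumes one resource class with `units` identical, fully pipelined
    units; the II must be a whole number of cycles, hence the ceiling.
    """
    return -(-ops // units)  # integer ceiling division


def effective_ii(ops_per_iter: int, units: int, unroll: int) -> Fraction:
    """Cycles per ORIGINAL iteration after unrolling `unroll` times.

    The unrolled body holds `unroll` copies of the operations, so its II
    is rounded up once for the whole body and then divided by the number
    of original iterations that body covers.
    """
    ii_unrolled = res_mii(ops_per_iter * unroll, units)
    return Fraction(ii_unrolled, unroll)


if __name__ == "__main__":
    # Hypothetical loop: 3 memory operations per iteration, 2 memory ports.
    for u in (1, 2, 3, 4):
        print(f"unroll={u}: effective II = {effective_ii(3, 2, u)}")
```

Under these assumptions, the loop with 3 operations and 2 units needs an II of 2 cycles when not unrolled, but unrolling twice yields an II of 3 cycles for two iterations, an effective II of 3/2. With 1 operation per iteration and 2 units, `effective_ii(1, 2, 2)` is 1/2, a resource-only view of how more than one iteration per cycle becomes possible.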

Original language: English (US)
Pages (from-to): 327-337
Number of pages: 11
Journal: Proceedings of the Annual International Symposium on Microarchitecture
State: Published - Dec 1 1995
Event: Proceedings of the 1995 28th Annual International Symposium on Microarchitecture - Ann Arbor, MI, USA
Duration: Nov 29 1995 → Dec 1 1995

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software