Putting the fill unit to work: Dynamic optimizations for trace cache microprocessors

Daniel Holmes Friendly, Sanjay Jeram Patel, Yale N. Patt

Research output: Contribution to journal › Article › peer-review


The fill unit is the structure which collects blocks of instructions and combines them into multi-block segments for storage in a trace cache. In this paper, we expand the role of the fill unit to include four dynamic optimizations: (1) Register move instructions are explicitly marked, enabling them to be executed within the decode logic. (2) Immediate values of dependent instructions are combined, if possible, which removes a step in the dependency chain. (3) Dependent pairs of shift and add instructions are combined into scaled add instructions. (4) Instructions are arranged within the trace segment to minimize the impact of the latency through the operand bypass network. Together, these dynamic trace optimizations improve performance on the SPECint95 benchmarks by more than 17% and over all the benchmarks studied by slightly more than 18%.
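Optimizations (2) and (3) can be illustrated with a small software sketch. This is not the authors' hardware design. It is a toy model under stated assumptions: the `Instr` record, the opcode names (`addi`, `shli`, `add`, `scaled_add`), the `MAX_SCALE` limit, and the function `optimize_segment` are all hypothetical and chosen only for illustration. The sketch makes one forward pass over a trace segment, folds the immediates of dependent `addi` pairs, and fuses a shift feeding an add into a single scaled add.

```python
from dataclasses import dataclass, replace
from typing import Optional

MAX_SCALE = 3  # assumed limit on the shift amount a scaled add can absorb


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-operand instruction of a toy ISA."""
    op: str                     # "addi", "shli", "add", "scaled_add", ...
    dst: str                    # destination register
    src1: str                   # first source register
    src2: Optional[str] = None  # second source register, if any
    imm: int = 0                # immediate, or shift amount for scaled_add


def optimize_segment(segment):
    """Return a rewritten copy of a trace segment (a list of Instr).

    addi fed by addi -> one addi on the original source, immediates summed.
    add whose src1 is fed by shli -> scaled_add: dst = (src << imm) + src2.
    The producing instruction is kept, since its result may still be live.
    """
    out = []
    # register -> instruction that last wrote it, kept only while that
    # instruction's own source register still holds the value it read
    producer = {}
    for ins in segment:
        p = producer.get(ins.src1)
        if ins.op == "addi" and p is not None and p.op == "addi":
            # Read the producer's source directly; one dependency step removed.
            ins = replace(ins, src1=p.src1, imm=p.imm + ins.imm)
        elif (ins.op == "add" and p is not None and p.op == "shli"
              and 0 <= p.imm <= MAX_SCALE):
            # Only src1 is checked here; a fuller version would also try src2.
            ins = Instr("scaled_add", ins.dst, p.src1, ins.src2, p.imm)
        out.append(ins)
        # Writing ins.dst invalidates every producer that read that register.
        producer = {r: q for r, q in producer.items() if q.src1 != ins.dst}
        if ins.src1 != ins.dst:
            producer[ins.dst] = ins
        else:
            # The instruction overwrote its own source, so it cannot be bypassed.
            producer.pop(ins.dst, None)
    return out
```

For example, passing `[Instr("addi", "r2", "r1", imm=4), Instr("addi", "r3", "r2", imm=8)]` leaves the first instruction untouched and rewrites the second to read `r1` with an immediate of 12, so the two instructions no longer form a dependency chain. The original producers are deliberately left in place; removing them would require liveness information that a single trace segment does not provide in this sketch.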

Original language: English (US)
Pages (from-to): 173-181
Number of pages: 9
Journal: Proceedings of the Annual International Symposium on Microarchitecture
State: Published - 1998
Externally published: Yes

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software


