In this paper we address the important problem of instruction fetch for future wide-issue superscalar processors. Our approach focuses on understanding the interaction between software and hardware techniques that aim to increase instruction fetch bandwidth. That is the objective, for instance, of the Hardware Trace Cache (HTC).
We design a profile-based code reordering technique that aims to maximize the sequentiality of instructions while still trying to minimize instruction cache misses. We call our software approach the Software Trace Cache (STC).
We evaluate our software approach, and then compare it with the HTC and with the combination of both techniques. Our results show that for large codes with few loops and deterministic execution sequences, such as databases and some SPEC-INT codes, the STC offers similar or better results than an HTC. Moreover, when combining the software and hardware approaches, we obtain encouraging results: the STC and a small HTC together offer performance similar to that of a much larger HTC alone.
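The abstract does not spell out the reordering algorithm, so the following is only a minimal sketch of one generic way a profile could drive block placement: a greedy chaining of basic blocks along their hottest profiled edges. It is not the STC algorithm itself, and it ignores the second goal named above (limiting instruction cache misses). The names `build_layout` and `fall_through_fraction`, the block labels, and the profile counts are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def build_layout(block_counts, edge_counts):
    """Greedy profile-guided layout (illustrative, not the paper's method).

    block_counts: {block: execution count} from a profile run (assumed input).
    edge_counts:  {(src, dst): transition count} from the same profile.
    Returns the blocks in the order they would be placed in memory.
    """
    # Successors of each block, hottest edge first (ties broken by name).
    succs = defaultdict(list)
    for (src, dst), count in edge_counts.items():
        succs[src].append((count, dst))
    for src in succs:
        succs[src].sort(key=lambda item: (-item[0], item[1]))

    placed = set()
    layout = []
    # Seed a new chain from the most frequently executed unplaced block.
    for seed in sorted(block_counts, key=lambda b: (-block_counts[b], b)):
        current = seed
        while current is not None and current not in placed:
            placed.add(current)
            layout.append(current)
            # Extend the chain along the hottest edge to an unplaced block.
            current = next(
                (dst for _, dst in succs[current] if dst not in placed), None
            )
    return layout


def fall_through_fraction(layout, edge_counts):
    """Fraction of profiled transitions whose target is the next block in memory."""
    position = {block: i for i, block in enumerate(layout)}
    total = sum(edge_counts.values())
    sequential = sum(
        count
        for (src, dst), count in edge_counts.items()
        if position[dst] == position[src] + 1
    )
    return sequential / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical profile of an if/else diamond where the "B" side is hot.
    block_counts = {"A": 100, "B": 90, "C": 10, "D": 100}
    edge_counts = {("A", "B"): 90, ("A", "C"): 10, ("B", "D"): 90, ("C", "D"): 10}

    layout = build_layout(block_counts, edge_counts)
    print(layout)
    print(fall_through_fraction(["A", "B", "C", "D"], edge_counts))
    print(fall_through_fraction(layout, edge_counts))
```

In this toy profile, moving the rarely executed block `C` out of the hot path raises the share of transitions that fall through to the next block in memory, which is the kind of sequentiality a fetch unit can exploit.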
|Original language||English (US)|
|Title of host publication||ICS 2014 - Proceedings of the 28th ACM International Conference on Supercomputing|
|Publisher||Association for Computing Machinery|
|Number of pages||8|
|State||Published - Jun 10 2014|
|Event||25th ACM International Conference on Supercomputing, ICS 2014 - Munich, Germany (Duration: Jun 10 2014 → Jun 13 2014)|
|Name||Proceedings of the International Conference on Supercomputing|
|Other||25th ACM International Conference on Supercomputing, ICS 2014|
|Period||6/10/14 → 6/13/14|
ASJC Scopus subject areas
- Computer Science(all)