Abstract
A platform that supported Sequential Consistency (SC) for all codes, not only the well-synchronized ones, would simplify the task of programmers. Recently, several hardware architectures that support high-performance SC by committing groups of instructions at a time have been proposed. However, for a platform to support SC, it is not enough for the hardware to do so; the compiler has to support SC as well. This paper presents the hardware-compiler interface and the main compiler ideas for BulkCompiler, a simple compiler layer that works with the group-committing hardware to provide a whole-system high-performance SC platform. We introduce ISA primitives and software algorithms for BulkCompiler to drive instruction-group formation and to transform code to exploit the groups. Our simulation results show that BulkCompiler not only enables a whole-system SC environment, but also one that actually outperforms a conventional platform that uses the more relaxed Java Memory Model by an average of 37%. The speedups come from code optimization inside software-assembled instruction groups.
Original language | English (US) |
---|---|
Pages (from-to) | 133-144 |
Number of pages | 12 |
Journal | Proceedings of the Annual International Symposium on Microarchitecture, MICRO |
State | Published - 2009 |
Event | 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, New York, NY, United States (Dec 12 2009 → Dec 16 2009) |
Keywords
- Atomic region
- Chunk-based architecture
- Compiler optimization
- Sequential consistency
ASJC Scopus subject areas
- Hardware and Architecture