Abstract
The performance of the data cache in shared-memory multiprocessors has been shown to be different from that in uniprocessors. In particular, cache miss rates in multiprocessors do not show the sharp drop typical of uniprocessors when the size of the cache block increases. The resulting high cache miss rate is a cause of concern, since it can significantly limit the performance of multiprocessors. Some researchers have speculated that this effect is due to false sharing, the coherence transactions that result when different processors update different words of the same cache block in an interleaved fashion. While the analysis of six applications in this paper confirms that false sharing has a significant impact on the miss rate, the measurements also show that poor spatial locality among accesses to shared data has an even larger impact. To mitigate false sharing and to enhance spatial locality, we optimize the layout of shared data in cache blocks in a programmer-transparent manner. We show that this approach can reduce the number of misses on shared data by about 10% on average.
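The false-sharing effect described above can be illustrated with a toy model. The following is a minimal sketch, not the paper's simulator or its compiler transformation: it assumes a simple write-invalidate protocol, infinitely large caches, and a trace containing only writes. The names `count_coherence_misses`, `interleaved_writes`, and `BLOCK_WORDS` are hypothetical and chosen for illustration only.

```python
BLOCK_WORDS = 4  # assumed cache block size, in words


def count_coherence_misses(trace, block_words):
    """Count misses for a trace of (processor, word_address) writes.

    Toy model (assumption): write-invalidate coherence, infinite caches.
    A write misses when the processor holds no valid copy of the block,
    and every write invalidates the copies held by other processors.
    """
    holders = {}  # block number -> set of processors with a valid copy
    misses = 0
    for proc, addr in trace:
        block = addr // block_words
        if proc not in holders.get(block, set()):
            misses += 1
        holders[block] = {proc}  # the writer becomes the only valid holder
    return misses


def interleaved_writes(addr_p0, addr_p1, rounds):
    """Two processors alternately update their own private word."""
    return [(0, addr_p0), (1, addr_p1)] * rounds


# Packed layout: both words share block 0, so the block ping-pongs.
packed = count_coherence_misses(interleaved_writes(0, 1, 100), BLOCK_WORDS)

# Separated layout: the second word is moved to the start of block 1.
separated = count_coherence_misses(
    interleaved_writes(0, BLOCK_WORDS, 100), BLOCK_WORDS
)
```

Under these assumptions every write in the packed layout misses, while the separated layout incurs only the two initial cold misses, even though the processors never touch each other's word. Real miss rates depend on the protocol, cache size, and access pattern, so the numbers here show only the direction of the effect.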
Original language | English (US) |
---|---|
Pages (from-to) | 651-663 |
Number of pages | 13 |
Journal | IEEE Transactions on Computers |
Volume | 43 |
Issue number | 6 |
State | Published - Jun 1994 |
Keywords
- Multiprocessing
- cache memory
- false sharing
- optimizing compiler
- placement of data
- shared-memory multiprocessor
- sharing
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Hardware and Architecture
- Computational Theory and Mathematics