Abstract
In current computer architectures, the communication performance between threads varies depending on the memory hierarchy. This performance difference must be considered when mapping parallel applications to processor cores. In parallel applications based on the shared memory paradigm, communication is difficult to detect because it is implicit. Furthermore, dynamic mapping introduces several challenges, since it must find a suitable mapping and migrate the threads with low overhead while the application is running. We propose a mechanism to detect the communication pattern of shared memory applications by monitoring cache coherence protocols. We also propose heuristics that, combined with our communication detection mechanism, allow the operating system to perform the mapping dynamically. Experiments with the NAS Parallel Benchmarks showed reductions of up to 13.9% in execution time, 30.5% in cache misses, and 39.4% in the number of invalidation messages.
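As a rough illustration of the idea in the abstract, the sketch below builds a thread-to-thread communication matrix from a list of invalidation events and then groups heavily communicating threads into pairs. This is not the mechanism or the heuristics from the paper, whose details are not given here. The event format `(writer, sharers)`, the function names `build_comm_matrix` and `greedy_pair_mapping`, and the greedy pairing rule are all assumptions made for illustration only.

```python
from itertools import combinations


def build_comm_matrix(num_threads, invalidation_events):
    """Count communication between thread pairs.

    Hypothetical event format: (writer, sharers), meaning a write by
    `writer` invalidated cached copies held by the threads in `sharers`.
    """
    matrix = [[0] * num_threads for _ in range(num_threads)]
    for writer, sharers in invalidation_events:
        for sharer in sharers:
            if sharer != writer:
                # Treat communication as symmetric between the two threads.
                matrix[writer][sharer] += 1
                matrix[sharer][writer] += 1
    return matrix


def greedy_pair_mapping(matrix):
    """Pair threads greedily by descending communication count.

    Each returned pair is intended for two cores that share a cache.
    With an odd number of threads, one thread stays unpaired.
    This greedy rule is an illustrative stand-in, not the paper's heuristic.
    """
    n = len(matrix)
    candidates = sorted(
        combinations(range(n), 2),
        key=lambda pair: matrix[pair[0]][pair[1]],
        reverse=True,
    )
    paired = set()
    pairs = []
    for a, b in candidates:
        if a not in paired and b not in paired:
            pairs.append((a, b))
            paired.update((a, b))
    return pairs


if __name__ == "__main__":
    # Made-up events for four threads, for demonstration only.
    events = [(0, [1]), (1, [0]), (2, [3]), (0, [1]), (3, [2]), (0, [2])]
    comm = build_comm_matrix(4, events)
    print(greedy_pair_mapping(comm))
```

In a real system the pairs would then be translated into core assignments that respect the machine's cache topology, and the matrix would be refreshed periodically so the mapping can follow changes in the communication pattern.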
Original language | English (US) |
---|---|
Pages (from-to) | 2215-2228 |
Number of pages | 14 |
Journal | Journal of Parallel and Distributed Computing |
Volume | 74 |
Issue number | 3 |
State | Published - Mar 2014 |
Externally published | Yes |
Keywords
- Cache coherence protocols
- Communication pattern
- Parallel applications
- Shared memory
- Thread communication
- Thread mapping
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence