Abstract
Hardware-based speculative parallelization of non-analyzable codes on distributed shared-memory multiprocessors is challenging. This paper proposes a scheme to parallelize codes that have a modest number of cross-iteration dependences. Simulation results suggest that the scheme is promising: a 16-processor parallel execution of 4 important loops runs 4.2 and 31 times faster, respectively, than two different serial executions of the loops.
Original language | English (US) |
---|---|
Pages | 135-139 |
Number of pages | 5 |
State | Published - 1999 |
Event | Proceedings of the 1999 5th International Symposium on High-Performance Computer Architecture, HPCA - Orlando, FL, USA (Jan 9 1999 → Jan 13 1999) |
ASJC Scopus subject areas
- Hardware and Architecture