TY - GEN
T1 - Adaptively mapping code in an intelligent memory architecture
AU - Solihin, Yan
AU - Lee, Jaejin
AU - Torrellas, Josep
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2001.
PY - 2001
Y1 - 2001
N2 - This paper presents an algorithm to automatically map code to a generic Processor-In-Memory (PIM) system that consists of a host processor and a much simpler memory processor. To achieve high performance with this type of architecture, code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the processors should overlap their execution as much as possible. Our algorithm is embedded in a compiler and run-time system and maps applications fully automatically using both static and dynamic information. Using a set of applications and a simulated architecture, we show average speedups of 1.7 over a single host with plain memory. The speedups are very close to, and often higher than, the ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited, and represents one step toward effectively mapping code to more advanced PIM systems.
AB - This paper presents an algorithm to automatically map code to a generic Processor-In-Memory (PIM) system that consists of a host processor and a much simpler memory processor. To achieve high performance with this type of architecture, code needs to be partitioned and scheduled such that each section is assigned to the processor on which it runs most efficiently. In addition, the processors should overlap their execution as much as possible. Our algorithm is embedded in a compiler and run-time system and maps applications fully automatically using both static and dynamic information. Using a set of applications and a simulated architecture, we show average speedups of 1.7 over a single host with plain memory. The speedups are very close to, and often higher than, the ideal speedups on a more expensive multiprocessor system composed of two identical host processors. Our work shows that heterogeneity can be cost-effectively exploited, and represents one step toward effectively mapping code to more advanced PIM systems.
UR - http://www.scopus.com/inward/record.url?scp=84945894465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84945894465&partnerID=8YFLogxK
U2 - 10.1007/3-540-44570-6_5
DO - 10.1007/3-540-44570-6_5
M3 - Conference contribution
AN - SCOPUS:84945894465
SN - 3540424067
SN - 9783540423287
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 71
EP - 84
BT - Intelligent Memory Systems - 2nd International Workshop, IMS 2000, Revised Papers
A2 - Chong, Frederic T.
A2 - Oskin, Mark
A2 - Kozyrakis, Christoforos
PB - Springer
T2 - 2nd International Workshop on Intelligent Memory Systems, IMS 2000
Y2 - 12 November 2000
ER -