TY - GEN
T1 - Workload adaptive shared memory multicore processors with reconfigurable interconnects
AU - Akram, Shoaib
AU - Kumar, Rakesh
AU - Chen, Deming
PY - 2009
Y1 - 2009
N2 - Interconnection networks for multicore processors are designed in a generic way to serve a diversity of workloads. For multicore processors, there is a considerable opportunity to achieve an improvement in performance by implementing interconnects which adapt to different program phases and to a variety of workloads. This paper proposes one such interconnection network for medium-scale (up to 32 cores) shared memory multicore processors and the associated means at the software level to utilize it effectively. The proposed architecture uses clustering to divide the cores on the chip among many groups called clusters. Reconfigurable logic is inserted between clusters to support either isolation or different policies for communication among clusters. The experiments show that the isolation property of clusters can improve overall throughput of a multicore processor by as much as 60% for multiprogramming workloads consisting of two and four applications. The area overhead of the additional logic is shown to be minimal.
AB - Interconnection networks for multicore processors are designed in a generic way to serve a diversity of workloads. For multicore processors, there is a considerable opportunity to achieve an improvement in performance by implementing interconnects which adapt to different program phases and to a variety of workloads. This paper proposes one such interconnection network for medium-scale (up to 32 cores) shared memory multicore processors and the associated means at the software level to utilize it effectively. The proposed architecture uses clustering to divide the cores on the chip among many groups called clusters. Reconfigurable logic is inserted between clusters to support either isolation or different policies for communication among clusters. The experiments show that the isolation property of clusters can improve overall throughput of a multicore processor by as much as 60% for multiprogramming workloads consisting of two and four applications. The area overhead of the additional logic is shown to be minimal.
UR - http://www.scopus.com/inward/record.url?scp=70350746335&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350746335&partnerID=8YFLogxK
U2 - 10.1109/SASP.2009.5226329
DO - 10.1109/SASP.2009.5226329
M3 - Conference contribution
AN - SCOPUS:70350746335
SN - 9781424449385
T3 - 2009 IEEE 7th Symposium on Application Specific Processors, SASP 2009
SP - 7
EP - 14
BT - 2009 IEEE 7th Symposium on Application Specific Processors, SASP 2009
T2 - 2009 IEEE 7th Symposium on Application Specific Processors, SASP 2009
Y2 - 27 July 2009 through 28 July 2009
ER -