TY - GEN
T1 - Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy
AU - Kim, Wooil
AU - Tavarageri, Sanket
AU - Sadayappan, P.
AU - Torrellas, Josep
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/7/18
Y1 - 2016/7/18
N2 - New architectures for extreme-scale computing need to be designed for higher energy efficiency than current systems. One recently-proposed extreme-scale manycore radically simplifies the architecture, and proposes a cluster-based on-chip memory hierarchy without hardware cache coherence. To program for such an environment, this paper proposes two approaches. They use shared-memory programming either inside clusters only, or both inside and across clusters. Both approaches rely on ISA support for writeback and self-invalidation operations. Our simulation results show that hardware-incoherent cache hierarchies with our support deliver reasonable performance for applications that were not written for such hierarchies. Specifically, for execution within a cluster, the average execution time of the applications is 2% higher than with hardware cache coherence; for execution across multiple clusters, it is 5% higher than with hardware cache coherence. This is accomplished with minimal hardware support.
AB - New architectures for extreme-scale computing need to be designed for higher energy efficiency than current systems. One recently-proposed extreme-scale manycore radically simplifies the architecture, and proposes a cluster-based on-chip memory hierarchy without hardware cache coherence. To program for such an environment, this paper proposes two approaches. They use shared-memory programming either inside clusters only, or both inside and across clusters. Both approaches rely on ISA support for writeback and self-invalidation operations. Our simulation results show that hardware-incoherent cache hierarchies with our support deliver reasonable performance for applications that were not written for such hierarchies. Specifically, for execution within a cluster, the average execution time of the applications is 2% higher than with hardware cache coherence; for execution across multiple clusters, it is 5% higher than with hardware cache coherence. This is accomplished with minimal hardware support.
KW - Cache coherence
KW - Hardware-incoherent caches
KW - Software-managed caches
UR - http://www.scopus.com/inward/record.url?scp=84983288503&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84983288503&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2016.76
DO - 10.1109/IPDPS.2016.76
M3 - Conference contribution
AN - SCOPUS:84983288503
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 555
EP - 565
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016
Y2 - 23 May 2016 through 27 May 2016
ER -