TY - GEN
T1 - FReaC Cache
T2 - 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
AU - Dhar, Ashutosh
AU - Wang, Xiaohao
AU - Franke, Hubertus
AU - Xiong, Jinjun
AU - Huang, Jian
AU - Hwu, Wen Mei
AU - Kim, Nam Sung
AU - Chen, Deming
N1 - Publisher Copyright: © 2020 IEEE Computer Society. All rights reserved.
PY - 2020/10
Y1 - 2020/10
AB - The need for higher energy efficiency has resulted in the proliferation of accelerators across platforms, with custom and reconfigurable accelerators adopted in both edge devices and cloud servers. However, existing solutions fall short in providing accelerators with low-latency, high-bandwidth access to the working set and suffer from the high latency and energy cost of data transfers. Such costs can severely limit the smallest granularity of the tasks that can be accelerated and thus the applicability of the accelerators. In this work, we present FReaC Cache, a novel architecture that natively supports reconfigurable computing in the last-level cache (LLC), thereby giving energy-efficient accelerators low-latency, high-bandwidth access to the working set. By leveraging the cache's existing dense memory arrays, buses, and logic folding, we construct a reconfigurable fabric in the LLC with minimal changes to the system, processor, cache, and memory architecture. FReaC Cache is a low-latency, low-cost, and low-power alternative to off-die/off-chip accelerators, and a flexible, low-cost alternative to fixed-function accelerators. We demonstrate an average speedup of 3X and Perf/W improvements of 6.1X over an edge-class multi-core CPU, while adding 3.5% to 15.3% area overhead per cache slice.
KW - In Cache Computing
KW - Logic Folding
KW - Near Memory Acceleration
KW - Reconfigurable Computing
UR - http://www.scopus.com/inward/record.url?scp=85097333897&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097333897&partnerID=8YFLogxK
U2 - 10.1109/MICRO50266.2020.00021
DO - 10.1109/MICRO50266.2020.00021
M3 - Conference contribution
AN - SCOPUS:85097333897
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 102
EP - 117
BT - Proceedings - 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
PB - IEEE Computer Society
Y2 - 17 October 2020 through 21 October 2020
ER -