TY - GEN
T1 - Energy-efficient reconfigurable cache architectures for accelerator-enabled embedded systems
AU - Farmahini-Farahani, Amin
AU - Kim, Nam Sung
AU - Morrow, Katherine
PY - 2014
Y1 - 2014
AB - High-performance embedded systems often include one or more embedded processors tightly coupled with more specialized accelerators. These accelerators improve both performance and energy efficiency because they are specialized for specific (or specific classes of) computations. Data communication between the accelerator and memory, however, is a potential bottleneck for both performance and energy efficiency. In this paper, we compare and evaluate, for the first time, the impact of L1 data cache design on performance and energy consumption of embedded processor-accelerator systems with shared memory. For this evaluation, we consider data cache design parameters such as size, associativity, and port count, as well as L1 cache sharing between the processor and accelerator. We demonstrate the potential of configurable caches to exploit diversity in cache requirements across hybrid software/hardware applications to significantly improve energy efficiency while maintaining high performance. Guided by these studies, we propose two techniques for improving the energy efficiency of the cache hierarchy in processor-accelerator systems. The first technique adds configurability to the accelerator-cache interface to allow the accelerator to either share the processor's L1 data cache or use its own private L1 cache. The second technique modifies the L1 cache structure to provide a configurable tradeoff between bandwidth (number of ports) and capacity. Our simulation results show that the first and second techniques improve cache hierarchy energy efficiency by up to 64% and 33%, respectively, over that of non-configurable caches.
UR - http://www.scopus.com/inward/record.url?scp=84904470871&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904470871&partnerID=8YFLogxK
U2 - 10.1109/ISPASS.2014.6844485
DO - 10.1109/ISPASS.2014.6844485
M3 - Conference contribution
AN - SCOPUS:84904470871
SN - 9781479936052
T3 - ISPASS 2014 - IEEE International Symposium on Performance Analysis of Systems and Software
SP - 211
EP - 220
BT - ISPASS 2014 - IEEE International Symposium on Performance Analysis of Systems and Software
PB - IEEE Computer Society
T2 - 2014 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2014
Y2 - 23 March 2014 through 25 March 2014
ER -