TY - GEN
T1 - NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules
T2 - 2015 IEEE 21st International Symposium on High Performance Computer Architecture, HPCA 2015
AU - Farmahini-Farahani, Amin
AU - Ahn, Jung Ho
AU - Morrow, Katherine
AU - Kim, Nam Sung
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/3/6
Y1 - 2015/3/6
AB - Energy consumed for transferring data across the processor memory hierarchy constitutes a large fraction of total system energy consumption, and this fraction has steadily increased with technology scaling. In this paper, we propose near-DRAM acceleration (NDA) architectures, which process data using accelerators 3D-stacked on the DRAM devices that comprise off-chip main memory modules. NDA transfers most data through high-bandwidth, low-energy 3D interconnects between accelerators and DRAM devices instead of low-bandwidth, high-energy off-chip interconnects between a processor and DRAM devices, substantially reducing energy consumption and improving performance. Unlike previous near-memory processing architectures, NDA is built upon commodity DRAM devices; apart from inserting through-silicon vias (TSVs) to 3D-interconnect DRAM devices and accelerators, NDA requires minimal changes to the commodity DRAM device and standard memory module architectures. This allows NDA to be more easily adopted in both existing and emerging systems. Our experiments demonstrate that, on average, our NDA-based system consumes 46% lower total energy (68% lower data transfer energy) and achieves 1.67× higher performance than a system that integrates the same accelerator logic within the processor itself.
UR - http://www.scopus.com/inward/record.url?scp=84934280905&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84934280905&partnerID=8YFLogxK
U2 - 10.1109/HPCA.2015.7056040
DO - 10.1109/HPCA.2015.7056040
M3 - Conference contribution
AN - SCOPUS:84934280905
T3 - 2015 IEEE 21st International Symposium on High Performance Computer Architecture, HPCA 2015
SP - 283
EP - 295
BT - 2015 IEEE 21st International Symposium on High Performance Computer Architecture, HPCA 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 7 February 2015 through 11 February 2015
ER -