TY - GEN
T1 - Evaluating irregular memory access on OpenCL FPGA platforms
T2 - 27th International Conference on Field Programmable Logic and Applications, FPL 2017
AU - Luo, Yingyi
AU - Wen, Xianshan
AU - Yoshii, Kazutomo
AU - Ogrenci-Memik, Seda
AU - Memik, Gokhan
AU - Finkel, Hal
AU - Cappello, Franck
N1 - Funding Information:
This work was partially funded by the DOE grant DESC0012531 and the NSF grant CCF-1422489. This material was based upon work supported by the U.S. Department of Energy Office of Science, under contract DE-AC02-06CH11357.
Publisher Copyright:
© 2017 Ghent University.
PY - 2017/10/2
Y1 - 2017/10/2
N2 - FPGAs are becoming an attractive choice as a heterogeneous computing unit for scientific computing because FPGA vendors are adding floating-point-optimized architectures to their product lines. Additionally, high-level synthesis (HLS) tools such as Altera OpenCL SDK are emerging, which could potentially break the FPGA programming wall and provide a streamlined flow for domain experts in scientific computing. On the other hand, providing high performance in the presence of irregular memory access patterns to off-chip memory remains a challenge for the automated synthesis flows. In this paper, we study the performance/energy characteristics of OpenCL-generated FPGA designs on irregular memory access patterns, targeting XSBench, a memory-intensive Monte Carlo simulation code, as a case study. To complete our study, we implement XSBench in OpenCL and study optimization strategies for FPGAs. We observe that our OpenCL implementation of XSBench achieves 50% higher energy efficiency on an Intel Arria10-based FPGA platform than on an Intel Xeon 8-core CPU while trading off 35% of performance.
AB - FPGAs are becoming an attractive choice as a heterogeneous computing unit for scientific computing because FPGA vendors are adding floating-point-optimized architectures to their product lines. Additionally, high-level synthesis (HLS) tools such as Altera OpenCL SDK are emerging, which could potentially break the FPGA programming wall and provide a streamlined flow for domain experts in scientific computing. On the other hand, providing high performance in the presence of irregular memory access patterns to off-chip memory remains a challenge for the automated synthesis flows. In this paper, we study the performance/energy characteristics of OpenCL-generated FPGA designs on irregular memory access patterns, targeting XSBench, a memory-intensive Monte Carlo simulation code, as a case study. To complete our study, we implement XSBench in OpenCL and study optimization strategies for FPGAs. We observe that our OpenCL implementation of XSBench achieves 50% higher energy efficiency on an Intel Arria10-based FPGA platform than on an Intel Xeon 8-core CPU while trading off 35% of performance.
UR - http://www.scopus.com/inward/record.url?scp=85034434007&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85034434007&partnerID=8YFLogxK
U2 - 10.23919/FPL.2017.8056827
DO - 10.23919/FPL.2017.8056827
M3 - Conference contribution
AN - SCOPUS:85034434007
T3 - 2017 27th International Conference on Field Programmable Logic and Applications, FPL 2017
BT - 2017 27th International Conference on Field Programmable Logic and Applications, FPL 2017
A2 - Gohringer, Diana
A2 - Stroobandt, Dirk
A2 - Mentens, Nele
A2 - Santambrogio, Marco
A2 - Nurmi, Jari
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 4 September 2017 through 6 September 2017
ER -