clMPI: An OpenCL Extension for Interoperation with the Message Passing Interface

Hiroyuki Takizawa, Makoto Sugawara, Shoichi Hirasawa, Isaac Gelado, Hiroaki Kobayashi, Wen-Mei W Hwu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes an OpenCL extension, clMPI, that allows a programmer to write code as if GPUs communicated with one another without any help from CPUs. The clMPI extension offers OpenCL commands for inter-node data transfers that are executed in the same manner as other OpenCL commands. Thus, clMPI naturally extends the conventional OpenCL programming model so as to improve MPI interoperability. Unlike conventional joint programming of MPI and OpenCL, the CPU does not need to block in order to serialize dependent MPI and OpenCL operations. Hence, an application can easily exploit opportunities to overlap the parallel activities of CPUs and GPUs. In addition, the implementation details of data transfers are hidden behind the extension, so application programmers can use optimized data transfers without any tricky programming techniques. As a result, the extension can improve not only performance but also performance portability across different system configurations. The evaluation results show that the clMPI extension can use an optimized data-transfer implementation and thereby increase sustained performance by about 14% on the Himeno benchmark when the communication time cannot be overlapped with the computation time.
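The core idea in the abstract — inter-node transfers become ordinary queue commands, so the host never blocks just to serialize a kernel and the MPI send that depends on it — can be sketched with a toy Python model. This is not the clMPI API; the class and names below are purely illustrative stand-ins for an in-order OpenCL command queue.

```python
import threading
import queue
import time

class CommandQueue:
    """In-order command queue drained by a worker thread -- a toy stand-in
    for an OpenCL command queue. enqueue() returns immediately (like
    clEnqueue*); finish() blocks the host (like clFinish)."""
    def __init__(self):
        self._q = queue.Queue()
        self._drained = threading.Event()
        threading.Thread(target=self._worker, daemon=True).start()

    def enqueue(self, fn):
        """Submit a command and return without waiting for it to run."""
        self._q.put(fn)

    def finish(self):
        """Block the host thread until every enqueued command completes."""
        self._q.put(None)
        self._drained.wait()

    def _worker(self):
        while True:
            fn = self._q.get()
            if fn is None:
                self._drained.set()
                return
            fn()            # in-order execution preserves dependencies

device_log = []
q = CommandQueue()

def kernel():
    time.sleep(0.1)         # simulated device computation
    device_log.append("kernel")

# Conventional joint programming would serialize on the host:
#     clFinish(queue); MPI_Send(...);
# In the clMPI model, the transfer is itself a queue command ordered
# after the kernel, so the host thread continues immediately.
q.enqueue(kernel)
q.enqueue(lambda: device_log.append("inter-node send"))

host_log = ["host work overlapped with device activity"]  # host runs on
q.finish()

print(device_log)   # ['kernel', 'inter-node send'] -- order preserved
print(host_log)
```

The point of the sketch is the contrast in who enforces the ordering: here the in-order queue (not a blocked CPU thread) guarantees the send runs after the kernel, which is what lets host work overlap device activity.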

Original language: English (US)
Title of host publication: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
Publisher: IEEE Computer Society
Pages: 1138-1148
Number of pages: 11
ISBN (Print): 9780769549798
DOI: 10.1109/IPDPSW.2013.183
State: Published - Jan 1 2013
Event: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013 - Kyoto, Japan
Duration: Jul 22 2013 - Jul 26 2013

Publication series

Name: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013

Other

Other: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013
Country: Japan
City: Kyoto
Period: 7/22/13 - 7/26/13


Keywords

  • MPI interoperability
  • OpenCL extension
  • clMPI

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
  • Theoretical Computer Science

Cite this

Takizawa, H., Sugawara, M., Hirasawa, S., Gelado, I., Kobayashi, H., & Hwu, W-M. W. (2013). clMPI: An OpenCL extension for interoperation with the message passing interface. In Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013 (pp. 1138-1148). [6651000] IEEE Computer Society. https://doi.org/10.1109/IPDPSW.2013.183

