TY - GEN
T1 - Nested parallelism with algorithmic skeletons
AU - Majidi, Alireza
AU - Thomas, Nathan
AU - Smith, Timmie
AU - Amato, Nancy
AU - Rauchwerger, Lawrence
N1 - Funding Information:
This research was supported in part by NSF awards CNS-0551685, CCF-1439145, CCF-1423111, IIS-0916053, IIS-0917266, EFRI-1240483, RI-1217991, by NIH NCI R25 CA090301-11, and by DOE awards DE-NA0002376, B575363. This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - Nested parallelism is a natural way to express programs for hierarchical systems. It enables a compositional programming approach that can then be mapped onto the system hierarchy. In this paper, we present nested algorithm composition in the STAPL Skeleton Library (SSL), which uses a nested dataflow model as its internal representation. We show how a high-level program specification using SSL allows for asynchronous computation and improved locality. We study both the specification and performance of the STAPL implementation of Kripke, a mini-app developed by Lawrence Livermore National Laboratory. Kripke has multiple levels of parallelism and a number of data layouts, making it an excellent test bed for evaluating the effectiveness of a nested parallel programming approach. Performance results are provided for six different nesting orders of the benchmark, demonstrating the flexibility and performance of nested algorithmic skeleton composition in STAPL.
AB - Nested parallelism is a natural way to express programs for hierarchical systems. It enables a compositional programming approach that can then be mapped onto the system hierarchy. In this paper, we present nested algorithm composition in the STAPL Skeleton Library (SSL), which uses a nested dataflow model as its internal representation. We show how a high-level program specification using SSL allows for asynchronous computation and improved locality. We study both the specification and performance of the STAPL implementation of Kripke, a mini-app developed by Lawrence Livermore National Laboratory. Kripke has multiple levels of parallelism and a number of data layouts, making it an excellent test bed for evaluating the effectiveness of a nested parallel programming approach. Performance results are provided for six different nesting orders of the benchmark, demonstrating the flexibility and performance of nested algorithmic skeleton composition in STAPL.
KW - Algorithmic skeletons
KW - Dataflow
KW - Kripke mini-app
KW - Nested parallelism
KW - Sweep algorithm
UR - http://www.scopus.com/inward/record.url?scp=85076318874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076318874&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-34627-0_12
DO - 10.1007/978-3-030-34627-0_12
M3 - Conference contribution
AN - SCOPUS:85076318874
SN - 9783030346263
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 159
EP - 175
BT - Languages and Compilers for Parallel Computing - 31st International Workshop, LCPC 2018, Revised Selected Papers
A2 - Hall, Mary
A2 - Sundar, Hari
PB - Springer
T2 - 31st International Workshop on Languages and Compilers for Parallel Computing, LCPC 2018
Y2 - 9 October 2018 through 11 October 2018
ER -