TY - GEN
T1 - Asynchronous nested parallelism for dynamic applications in distributed memory
AU - Papadopoulos, Ioannis
AU - Thomas, Nathan
AU - Fidel, Adam
AU - Hoxha, Dielli
AU - Amato, Nancy Marie
AU - Rauchwerger, Lawrence
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
N2 - Nested parallelism is of increasing interest for both expressivity and performance. Many problems are naturally expressed with this divide-and-conquer software design approach. In addition, programmers with target architecture knowledge employ nested parallelism for performance, imposing a hierarchy in the application to increase locality and resource utilization, often at the cost of implementation complexity. While dynamic applications are a natural fit for the approach, support for nested parallelism in distributed systems is generally limited to well-structured applications engineered with distinct phases of intranode computation and inter-node communication. This model makes expressing irregular applications difficult and also hurts performance by introducing unnecessary latency and synchronizations. In this paper we describe an approach to asynchronous nested parallelism which provides uniform treatment of nested computation across distributed memory. This approach allows efficient execution while supporting dynamic applications which cannot be mapped onto the machine in the rigid manner of regular applications. We use several graph algorithms as examples to demonstrate our library’s expressivity, flexibility, and performance.
AB - Nested parallelism is of increasing interest for both expressivity and performance. Many problems are naturally expressed with this divide-and-conquer software design approach. In addition, programmers with target architecture knowledge employ nested parallelism for performance, imposing a hierarchy in the application to increase locality and resource utilization, often at the cost of implementation complexity. While dynamic applications are a natural fit for the approach, support for nested parallelism in distributed systems is generally limited to well-structured applications engineered with distinct phases of intranode computation and inter-node communication. This model makes expressing irregular applications difficult and also hurts performance by introducing unnecessary latency and synchronizations. In this paper we describe an approach to asynchronous nested parallelism which provides uniform treatment of nested computation across distributed memory. This approach allows efficient execution while supporting dynamic applications which cannot be mapped onto the machine in the rigid manner of regular applications. We use several graph algorithms as examples to demonstrate our library’s expressivity, flexibility, and performance.
KW - Asynchronous
KW - Dynamic applications
KW - Graph
KW - Isolation
KW - Nested parallelism
UR - http://www.scopus.com/inward/record.url?scp=84961189464&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961189464&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-29778-1_7
DO - 10.1007/978-3-319-29778-1_7
M3 - Conference contribution
AN - SCOPUS:84961189464
SN - 9783319297774
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 106
EP - 121
BT - Languages and Compilers for Parallel Computing - 28th International Workshop, LCPC 2015, Revised Selected Papers
A2 - Shen, Xipeng
A2 - Mueller, Frank
A2 - Tuck, James
PB - Springer
T2 - 28th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2015
Y2 - 9 September 2015 through 11 September 2015
ER -