TY - GEN
T1 - Parallel reductions
T2 - 15th Workshop on Languages and Compilers for Parallel Computing, LCPC 2002
AU - Yu, Hao
AU - Dang, Francis
AU - Rauchwerger, Lawrence
PY - 2005
Y1 - 2005
N2 - Irregular and dynamic memory reference patterns can cause significant performance variations for low level algorithms in general and especially for parallel algorithms. We have previously shown that parallel reduction algorithms are quite input sensitive and thus can benefit from an adaptive, reference pattern directed selection. In this paper we extend our previous work by detailing a systematic approach to dynamically select the best parallel algorithm. First we model the characteristics of the input, i.e., the memory reference pattern, with a descriptor vector. Then we measure the performance of several reduction algorithms for various values of the pattern descriptor. Finally we establish a (many-to-one) mapping (function) between a finite set of descriptor values and a set of algorithms. We thus obtain a performance ranking of the available algorithms with respect to a limited set of descriptor values. The actual dynamic selection code is generated using statistical regression methods or a decision tree. Finally we present experimental results to validate our modeling and prediction techniques.
AB - Irregular and dynamic memory reference patterns can cause significant performance variations for low level algorithms in general and especially for parallel algorithms. We have previously shown that parallel reduction algorithms are quite input sensitive and thus can benefit from an adaptive, reference pattern directed selection. In this paper we extend our previous work by detailing a systematic approach to dynamically select the best parallel algorithm. First we model the characteristics of the input, i.e., the memory reference pattern, with a descriptor vector. Then we measure the performance of several reduction algorithms for various values of the pattern descriptor. Finally we establish a (many-to-one) mapping (function) between a finite set of descriptor values and a set of algorithms. We thus obtain a performance ranking of the available algorithms with respect to a limited set of descriptor values. The actual dynamic selection code is generated using statistical regression methods or a decision tree. Finally we present experimental results to validate our modeling and prediction techniques.
UR - http://www.scopus.com/inward/record.url?scp=33745164571&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745164571&partnerID=8YFLogxK
U2 - 10.1007/11596110_13
DO - 10.1007/11596110_13
M3 - Conference contribution
AN - SCOPUS:33745164571
SN - 3540307818
SN - 9783540307815
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 188
EP - 202
BT - Languages and Compilers for Parallel Computing - 15th Workshop, LCPC 2002, Revised Papers
PB - Springer
Y2 - 25 July 2002 through 27 July 2002
ER -