TY - GEN
T1 - Iteration disambiguation for parallelism identification in time-sliced applications
AU - Ryoo, Shane
AU - Rodrigues, Christopher I.
AU - Hwu, Wen Mei W.
PY - 2008
Y1 - 2008
N2 - Media and scientific simulation applications have a large amount of parallelism that can be exploited in contemporary multi-core microprocessors. However, traditional pointer and array analysis techniques often fall short in automatically identifying this parallelism. This is due to the allocation and referencing patterns of time-slicing algorithms, where information flows from one time slice to the next. In these, an object is allocated within a loop and written to, with source data obtained from objects created in previous iterations of the loop. The objects are typically allocated at the same static call site through the same call chain in the call graph, making them indistinguishable by traditional heap-sensitive analysis techniques that use call chains to distinguish heap objects. As a result, the compiler cannot separate the source and destination objects within each time slice of the algorithm. In this work we discuss an analysis that quickly identifies these objects through a partially flow-sensitive technique called iteration disambiguation. This is done through a relatively simple aging mechanism. We show that this analysis can distinguish objects allocated in different time slices across a wide range of benchmark applications within tens of seconds even for complete media applications. We will also discuss the obstacles to automatically identifying the remaining parallelism in studied applications and propose methods to address them.
AB - Media and scientific simulation applications have a large amount of parallelism that can be exploited in contemporary multi-core microprocessors. However, traditional pointer and array analysis techniques often fall short in automatically identifying this parallelism. This is due to the allocation and referencing patterns of time-slicing algorithms, where information flows from one time slice to the next. In these, an object is allocated within a loop and written to, with source data obtained from objects created in previous iterations of the loop. The objects are typically allocated at the same static call site through the same call chain in the call graph, making them indistinguishable by traditional heap-sensitive analysis techniques that use call chains to distinguish heap objects. As a result, the compiler cannot separate the source and destination objects within each time slice of the algorithm. In this work we discuss an analysis that quickly identifies these objects through a partially flow-sensitive technique called iteration disambiguation. This is done through a relatively simple aging mechanism. We show that this analysis can distinguish objects allocated in different time slices across a wide range of benchmark applications within tens of seconds even for complete media applications. We will also discuss the obstacles to automatically identifying the remaining parallelism in studied applications and propose methods to address them.
UR - http://www.scopus.com/inward/record.url?scp=54249158455&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=54249158455&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-85261-2_8
DO - 10.1007/978-3-540-85261-2_8
M3 - Conference contribution
AN - SCOPUS:54249158455
SN - 3540852603
SN - 9783540852605
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 110
EP - 124
BT - Languages and Compilers for Parallel Computing - 20th International Workshop, LCPC 2007, Revised Selected Papers
T2 - 20th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2007
Y2 - 11 October 2007 through 13 October 2007
ER -