TY - GEN
T1 - Logical inference techniques for loop parallelization
AU - Oancea, Cosmin E.
AU - Rauchwerger, Lawrence
PY - 2012
Y1 - 2012
N2 - This paper presents a fully automatic approach to loop parallelization that integrates the use of static and run-time analysis and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = Ø, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in a smaller number of conservative approximations but exhibits a potentially-high runtime cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = Ø). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the obtained complex predicates (F(S)) into a sequence of sufficient-independence conditions that are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from PERFECTCLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full program speedups than the Intel and IBM Fortran compilers.
AB - This paper presents a fully automatic approach to loop parallelization that integrates the use of static and run-time analysis and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = Ø, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in a smaller number of conservative approximations but exhibits a potentially-high runtime cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = Ø). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the obtained complex predicates (F(S)) into a sequence of sufficient-independence conditions that are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from PERFECTCLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full program speedups than the Intel and IBM Fortran compilers.
KW - Auto-parallelization
KW - Independence predicates
KW - USR
UR - http://www.scopus.com/inward/record.url?scp=84863468202&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863468202&partnerID=8YFLogxK
U2 - 10.1145/2254064.2254124
DO - 10.1145/2254064.2254124
M3 - Conference contribution
AN - SCOPUS:84863468202
SN - 9781450312059
T3 - Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
SP - 509
EP - 520
BT - PLDI'12 - Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation
T2 - 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI'12
Y2 - 11 June 2012 through 16 June 2012
ER -