A scalable method for run-time loop parallelization

Research output: Contribution to journalArticlepeer-review


Current parallelizing compilers do a reasonable job of extracting parallelism from programs with regular, well behaved, statically analyzable access patterns. However, they cannot extract a significant fraction of the avaialable, parallelism if the program has a complex and/or statically insufficiently defined access pattern, e.g., simulation programs with irregular domains and/or dynamically changing interactions. Since such programs represent a large fraction of all applications, techniques are needed for extracting their inherent parallelism at run-time. In this paper we give a new run-time technique for finding an optimal parallel execution schedule for a partially parallel loop, i.e., a loop whose parallelization requires synchronization to ensure that the iterations are executed in the correct order. Given the original loop, the compiler generates inspector code that performas run-time preprocessing of the loop's access pattern, and scheduler code that schedules (and executes) the loop interations. The inspector is fully parallel, uses no sychronization, and can be applied to any loop (from which an inspector can be extracted). In addition, it can implement at run-time the two most effective transformations for increasing the amount of parallelism in a loop:array privatization and reduction parallelization (elementwise). The ability to identify privatizable and reduction variables is very powerful since it eliminates the data dependences involving these variables and

Original languageEnglish (US)
Pages (from-to)537-576
Number of pages40
JournalInternational Journal of Parallel Programming
Issue number6
StatePublished - Dec 1995

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Information Systems

Fingerprint Dive into the research topics of 'A scalable method for run-time loop parallelization'. Together they form a unique fingerprint.

Cite this