Abstract
Several recent experimental and theoretical results show significant performance benefits of recursive algorithms on both multi-level memory hierarchies and shared-memory systems. In particular, such algorithms have the data reuse characteristics of a blocked algorithm that is simultaneously blocked at many different levels. Most existing applications, however, are written using ordinary loops. We present a new compiler transformation that can be used to convert loop nests into recursive form automatically. We show that the algorithm is fast and effective, handling loop nests with arbitrary nesting and control flow. The transformation achieves substantial performance improvements for several linear algebra codes even on a current system with a two-level cache hierarchy. As a side effect of this work, we also develop an improved algorithm for transitive dependence analysis (a powerful technique used in the recursion transformation and other loop transformations) that is much faster in practice than the best previously known algorithm.
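To make the idea of a loop nest in recursive form concrete, here is a minimal hand-written sketch in Python. It is an illustration only and not the output of the compiler transformation described in the abstract. The function names `matmul_loops` and `matmul_recursive`, the `base` cutoff parameter, and the choice of matrix multiplication as the example are all assumptions made for this sketch. The recursive version repeatedly halves the largest dimension of the iteration space and falls back to the ordinary loops once every dimension is small, which gives the "blocked at many levels" access pattern the abstract refers to.

```python
def matmul_loops(A, B, C, i0, i1, j0, j1, k0, k1):
    """Ordinary loop nest over the iteration box [i0,i1) x [j0,j1) x [k0,k1)."""
    for i in range(i0, i1):
        for j in range(j0, j1):
            for k in range(k0, k1):
                C[i][j] += A[i][k] * B[k][j]


def matmul_recursive(A, B, C, i0, i1, j0, j1, k0, k1, base=4):
    """Same computation, expressed as recursion over the iteration space.

    `base` is a hypothetical cutoff (must be >= 1): once every dimension
    is at most `base`, the plain loop nest runs on that small box.
    """
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= base:
        matmul_loops(A, B, C, i0, i1, j0, j1, k0, k1)
        return
    # Split the largest dimension in half and recurse on both halves.
    if di >= dj and di >= dk:
        mid = i0 + di // 2
        matmul_recursive(A, B, C, i0, mid, j0, j1, k0, k1, base)
        matmul_recursive(A, B, C, mid, i1, j0, j1, k0, k1, base)
    elif dj >= dk:
        mid = j0 + dj // 2
        matmul_recursive(A, B, C, i0, i1, j0, mid, k0, k1, base)
        matmul_recursive(A, B, C, i0, i1, mid, j1, k0, k1, base)
    else:
        # Halves of k are visited in increasing order, so for each C[i][j]
        # the additions happen in the same order as in the plain loops.
        mid = k0 + dk // 2
        matmul_recursive(A, B, C, i0, i1, j0, j1, k0, mid, base)
        matmul_recursive(A, B, C, i0, i1, j0, j1, mid, k1, base)


if __name__ == "__main__":
    n, m, p = 5, 7, 3
    A = [[i + k for k in range(m)] for i in range(n)]
    B = [[k * j + 1 for j in range(p)] for k in range(m)]
    C_loops = [[0] * p for _ in range(n)]
    C_rec = [[0] * p for _ in range(n)]
    matmul_loops(A, B, C_loops, 0, n, 0, p, 0, m)
    matmul_recursive(A, B, C_rec, 0, n, 0, p, 0, m, base=2)
    print(C_loops == C_rec)
```

Reordering iterations this way is safe here because each `C[i][j]` is only accumulated into. For general loop nests, deciding whether such a reordering preserves the program's meaning requires dependence analysis, which is the role the abstract assigns to transitive dependence analysis.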
Original language | English (US) |
---|---|
Pages | 169-181 |
Number of pages | 13 |
State | Published - 2000 |
Externally published | Yes |
Event | ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation (PLDI) - Vancouver, BC, Canada; Duration: Jun 18 2000 → Jun 21 2000 |
Conference
Conference | ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation (PLDI) |
---|---|
Country/Territory | Canada |
City | Vancouver, BC |
Period | 6/18/00 → 6/21/00 |
ASJC Scopus subject areas
- Software