TY - CHAP
T1 - The power of belady's algorithm in register allocation for long basic blocks
AU - Guo, Jia
AU - Garzarán, María Jesús
AU - Padua, David
N1 - Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2004
Y1 - 2004
N2 - Optimization techniques such as loop-unrolling and trace-scheduling can result in long straight-line codes. It is, however, unclear how well the register allocation algorithms of current compilers perform on these codes. Compilers may well have been optimized for human written codes, which are likely to have short basic blocks. To evaluate how the state of the art compilers behave on long straight-line codes, we wrote a compiler that implements the simple Belady's MIN algorithm. The main contribution of this paper is the evaluation of Belady's MIN algorithm when used for register allocation for long straight-line codes. These codes were executed on a MIPS R12000 processor. Our results show that applications compiled using Belady's MIN algorithm run faster than when compiled with the MIPSPro or GCC compiler. In particular, Fast Fourier Transforms (FFTs) of size 32 and 64 run 12% and 33% faster than when compiled using the MIPSPro compiler.
AB - Optimization techniques such as loop-unrolling and trace-scheduling can result in long straight-line codes. It is, however, unclear how well the register allocation algorithms of current compilers perform on these codes. Compilers may well have been optimized for human written codes, which are likely to have short basic blocks. To evaluate how the state of the art compilers behave on long straight-line codes, we wrote a compiler that implements the simple Belady's MIN algorithm. The main contribution of this paper is the evaluation of Belady's MIN algorithm when used for register allocation for long straight-line codes. These codes were executed on a MIPS R12000 processor. Our results show that applications compiled using Belady's MIN algorithm run faster than when compiled with the MIPSPro or GCC compiler. In particular, Fast Fourier Transforms (FFTs) of size 32 and 64 run 12% and 33% faster than when compiled using the MIPSPro compiler.
UR - http://www.scopus.com/inward/record.url?scp=20744448168&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=20744448168&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-24644-2_24
DO - 10.1007/978-3-540-24644-2_24
M3 - Chapter
AN - SCOPUS:20744448168
SN - 9783540246442
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 374
EP - 389
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
A2 - Rauchwerger, Lawrence
PB - Springer
ER -