TY - GEN
T1 - An analytical approach to scheduling code for superscalar and VLIW architectures
AU - Chen, Shyh-Kwei
AU - Fuchs, W.
AU - Hwu, Wen-mei
PY - 1994
Y1 - 1994
N2 - Superscalar and Very Long Instruction Word (VLIW) architectures exploit fine-grain parallelism to achieve better performance. Static scheduling techniques, such as trace scheduling [1] and superblock scheduling [2], can effectively produce compact code for these architectures. In this paper, we present an analytical approach for bookkeeping in code scheduling that alleviates the coding complexity and instruction duplication limitations of the previous approaches. We describe techniques that allow instructions to be moved around loop and if-then-else constructs using global information. We also show that according to the classification of the register sets, certain instructions can be moved around subroutine calls, since their register live ranges can be predetermined across the procedural boundaries at compile time. Performance is compared with respect to the speed-up, the code size and the scheduling time. Experimental results indicate that the code growth and the speed-up are both improved with a small increase in scheduling time.
UR - http://www.scopus.com/inward/record.url?scp=0007980074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0007980074&partnerID=8YFLogxK
U2 - 10.1109/ICPP.1994.50
DO - 10.1109/ICPP.1994.50
M3 - Conference contribution
AN - SCOPUS:0007980074
SN - 0849324939
SN - 9780849324932
T3 - Proceedings of the International Conference on Parallel Processing
SP - I285-I292
BT - Proceedings of the 1994 International Conference on Parallel Processing, ICPP 1994
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd International Conference on Parallel Processing, ICPP 1994
Y2 - 15 August 1994 through 19 August 1994
ER -