TY - JOUR
T1 - Some essential techniques for developing efficient petascale applications
AU - Kalé, L. V.
PY - 2008
Y1 - 2008
N2 - Multiple petaflops-class machines will appear during the coming year, and many multipetaflops machines are on the anvil. It will be a substantial challenge to make existing parallel CSE applications run efficiently on them, and even more challenging to design new applications that can effectively leverage the large computational power of these machines. Multicore chips and SMP nodes are becoming popular and pose challenges of their own. Further, a new set of challenges in productivity arise, especially if we wish to have a broader set of applications and people to use these machines. Reviewed here is a set of techniques that have proved useful in multiple parallel applications that have scaled to tens of thousands of processors, on machines such as the Blue Gene/L, Blue Gene/P, Cray XT3, and XT4. New challenges and potential solutions for the performance issues are identified. Issues presented by multicore chips and SMP nodes also are addressed. Also reviewed are some new and old ideas for increasing productivity in parallel programming substantially.
AB - Multiple petaflops-class machines will appear during the coming year, and many multipetaflops machines are on the anvil. It will be a substantial challenge to make existing parallel CSE applications run efficiently on them, and even more challenging to design new applications that can effectively leverage the large computational power of these machines. Multicore chips and SMP nodes are becoming popular and pose challenges of their own. Further, a new set of challenges in productivity arise, especially if we wish to have a broader set of applications and people to use these machines. Reviewed here is a set of techniques that have proved useful in multiple parallel applications that have scaled to tens of thousands of processors, on machines such as the Blue Gene/L, Blue Gene/P, Cray XT3, and XT4. New challenges and potential solutions for the performance issues are identified. Issues presented by multicore chips and SMP nodes also are addressed. Also reviewed are some new and old ideas for increasing productivity in parallel programming substantially.
UR - http://www.scopus.com/inward/record.url?scp=65549162180&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=65549162180&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/125/1/012036
DO - 10.1088/1742-6596/125/1/012036
M3 - Article
AN - SCOPUS:65549162180
SN - 1742-6588
VL - 125
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
M1 - 012036
ER -