TY - JOUR
T1 - Design and implementation of message-passing services for the Blue Gene/L supercomputer
AU - Almási, George
AU - Archer, Charles
AU - Castaños, José Gabriel
AU - Gunnels, John A.
AU - Erway, C. Chris
AU - Heidelberger, Philip
AU - Martorell, Xavier
AU - Moreira, José E.
AU - Pinnow, Kurt
AU - Ratterman, Joseph
AU - Steinmacher-Burow, Burkhard D.
AU - Gropp, William
AU - Toonen, Brian
PY - 2005
Y1 - 2005
N2 - The Blue Gene®/L (BG/L) supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of the Message Passing Interface (MPI), which leverages the hardware features of BG/L. MPI for BG/L is implemented on top of a more basic message-passing infrastructure called the message layer. This message layer can be used to implement other higher-level libraries and can also be used directly by applications. MPI and the message layer are used in the two BG/L modes of operation: the coprocessor mode and the virtual node mode. Performance measurements show that our message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one of the processors of a node to communication functions (coprocessor mode) greatly improves message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can have a positive impact on application performance.
AB - The Blue Gene®/L (BG/L) supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of the Message Passing Interface (MPI), which leverages the hardware features of BG/L. MPI for BG/L is implemented on top of a more basic message-passing infrastructure called the message layer. This message layer can be used to implement other higher-level libraries and can also be used directly by applications. MPI and the message layer are used in the two BG/L modes of operation: the coprocessor mode and the virtual node mode. Performance measurements show that our message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one of the processors of a node to communication functions (coprocessor mode) greatly improves message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can have a positive impact on application performance.
UR - http://www.scopus.com/inward/record.url?scp=21044456455&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=21044456455&partnerID=8YFLogxK
U2 - 10.1147/rd.492.0393
DO - 10.1147/rd.492.0393
M3 - Article
AN - SCOPUS:21044456455
SN - 0018-8646
VL - 49
SP - 393
EP - 406
JO - IBM Journal of Research and Development
JF - IBM Journal of Research and Development
IS - 2-3
ER -