TY - GEN
T1 - Multi-protocol active messages on a cluster of SMP's
AU - Lumetta, Steven S.
AU - Mainwaring, Alan M.
AU - Culler, David E.
PY - 1997
Y1 - 1997
N2 - Clusters of multiprocessors, or Clumps, promise to be the supercomputers of the future, but obtaining high performance on these architectures requires an understanding of interactions between the multiple levels of interconnection. In this paper, we present the first multi-protocol implementation of a lightweight message layer, a version of Active Messages-II running on a cluster of Sun Enterprise 5000 servers connected with Myrinet. This research brings together several pieces of high-performance interconnection technology: bus backplanes for symmetric multiprocessors, low-latency networks for connections between machines, and simple, user-level primitives for communication. The paper describes the shared memory message-passing protocol and analyzes the multi-protocol implementation with both microbenchmarks and Split-C applications. Three aspects of the communication layer are critical to performance: The overhead of cache-coherence mechanisms, the method of managing concurrent access, and the cost of accessing state with the slower protocol. Through the use of an adaptive polling strategy, the multi-protocol implementation limits performance interactions between the protocols, delivering up to 160 MB/s of bandwidth with 3.6 microsecond end-to-end latency. Applications within an SMP benefit from this fast communication, running up to 75% faster than on a network of uniprocessor workstations. Applications running on the entire Clump are limited by the balance of NIC's to processors in our system, and are typically slower than on the NOW. These results illustrate several potential pitfalls for the Clumps architecture.
AB - Clusters of multiprocessors, or Clumps, promise to be the supercomputers of the future, but obtaining high performance on these architectures requires an understanding of interactions between the multiple levels of interconnection. In this paper, we present the first multi-protocol implementation of a lightweight message layer, a version of Active Messages-II running on a cluster of Sun Enterprise 5000 servers connected with Myrinet. This research brings together several pieces of high-performance interconnection technology: bus backplanes for symmetric multiprocessors, low-latency networks for connections between machines, and simple, user-level primitives for communication. The paper describes the shared memory message-passing protocol and analyzes the multi-protocol implementation with both microbenchmarks and Split-C applications. Three aspects of the communication layer are critical to performance: The overhead of cache-coherence mechanisms, the method of managing concurrent access, and the cost of accessing state with the slower protocol. Through the use of an adaptive polling strategy, the multi-protocol implementation limits performance interactions between the protocols, delivering up to 160 MB/s of bandwidth with 3.6 microsecond end-to-end latency. Applications within an SMP benefit from this fast communication, running up to 75% faster than on a network of uniprocessor workstations. Applications running on the entire Clump are limited by the balance of NIC's to processors in our system, and are typically slower than on the NOW. These results illustrate several potential pitfalls for the Clumps architecture.
UR - http://www.scopus.com/inward/record.url?scp=84900296956&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84900296956&partnerID=8YFLogxK
U2 - 10.1145/509593.509596
DO - 10.1145/509593.509596
M3 - Conference contribution
AN - SCOPUS:84900296956
SN - 0897919858
SN - 9780897919852
T3 - Proceedings of the International Conference on Supercomputing
BT - Proceedings of the 1997 ACM/IEEE Conference on Supercomputing, SC 1997
PB - Association for Computing Machinery
T2 - 1997 ACM/IEEE Conference on Supercomputing, SC 1997
Y2 - 15 November 1997 through 21 November 1997
ER -