Abstract
Deploying a large parallel system is typically a complex process involving several steps: preparation, delivery, installation, testing, and acceptance. Although several petascale machines are now available, the steps and lessons from their deployment are rarely described in the literature. This paper presents the experience gained during the deployment of Blue Waters, the largest supercomputer ever built by Cray and one of the most powerful machines currently available for open science. The presentation focuses on the final deployment steps, in which the system was intensively tested and accepted by NCSA. After a brief introduction to the Blue Waters architecture, the paper describes in detail the set of acceptance tests employed and reports many of the results obtained. The major lessons learned during the process follow. These experiences and lessons should be useful for guiding similarly complex deployments in the future.
Original language | English (US) |
---|---|
Pages (from-to) | 198-209 |
Number of pages | 12 |
Journal | Procedia Computer Science |
Volume | 29 |
State | Published - Jan 1 2014 |
Event | 14th Annual International Conference on Computational Science, ICCS 2014 - Cairns, QLD, Australia; Duration: Jun 10 2014 → Jun 12 2014 |
Keywords
- Acceptance testing
- Large system deployment
- Petascale performance
ASJC Scopus subject areas
- Computer Science (all)