Optimization of a parallel ocean general circulation model

Ping Wang, Daniel S. Katz, Yi Chao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Global climate modeling is one of the grand challenges of computational science, and ocean modeling plays an important role in both understanding current climatic conditions and predicting future climate change. Three-dimensional time-dependent ocean general circulation models (OGCMs) require a large amount of memory and processing time to run realistic simulations. Recent advances in computing hardware have dramatically affected the prospects for studying the global climate. The significant computational resources of massively parallel supercomputers promise to make such studies feasible. In addition to using advanced hardware, designing and implementing a well-optimized parallel ocean code will significantly improve computational performance and reduce the total research time needed to complete these studies. In our present work, we chose the most widely used OGCM code as our base code. This OGCM is based on the Parallel Ocean Program (POP), developed in FORTRAN 90 on the Los Alamos CM-2 Connection Machine by the Los Alamos ocean modeling research group. During the first half of 1994, the code was ported to the Cray T3D by Cray Research using SHMEM-based message passing. Because the code on the T3D was still time-consuming for large problems, improving its performance was considered essential. We have developed several general strategies to optimize the ocean general circulation model on the Cray T3D. These strategies include memory optimization, effective use of arithmetic pipelines, and use of optimized libraries. The optimized code runs 2 to 2.5 times faster than the original code, a significant performance improvement for modeling large-scale ocean flows. Many test runs of both the original and the optimized code have been carried out on the Cray T3D using various numbers of processors (1-256). Comparisons are made for a variety of real-world problems.
The optimized code scales nearly linearly with the number of processors, and its speedup data also show excellent improvement over the original code. In addition to discussing the optimization of the code, we also address the issue of portability. Given the short life cycle of massively parallel computers, usually on the order of three to five years, we emphasize the portability of the ocean model and the associated optimization routines across several computing platforms. Currently, the ocean modeling code has been ported successfully to the Hewlett Packard (HP)/Convex SPP-2000, and is readily portable to the Cray T3E. This paper reports our efforts to optimize the parallel implementations of the oceanic model. So far, the work has focused on improving the load balancing and single-node performance of the code on the Cray T3D. As a result, the atmosphere and ocean model components running side by side can achieve a performance level of slightly more than 10 GFLOPS on 512 processors of that machine. We have also developed a user-friendly coupling interface with atmospheric and biogeochemical models to make global climate modeling more complete and more realistic.

Original language: English (US)
Title of host publication: Proceedings of the 1997 ACM/IEEE Conference on Supercomputing, SC 1997
Publisher: Association for Computing Machinery
ISBN (Print): 0897919858, 9780897919852
State: Published - 1997
Externally published: Yes
Event: 1997 ACM/IEEE Conference on Supercomputing, SC 1997 - San Jose, CA, United States
Duration: Nov 15 1997 → Nov 21 1997

Publication series

Name: Proceedings of the International Conference on Supercomputing


Other: 1997 ACM/IEEE Conference on Supercomputing, SC 1997
Country/Territory: United States
City: San Jose, CA


  • Ocean modeling
  • Optimization
  • Parallel computations

ASJC Scopus subject areas

  • Computer Science(all)

