Design of MILC lattice QCD application for GPU clusters

Guochun Shi, Steven Gottlieb, Aaron Torok, Volodymyr Kindratenko

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

We present an implementation of the improved staggered quark action lattice QCD computation designed for execution on a GPU cluster. The parallelization strategy is based on dividing the space-time lattice along the time dimension and distributing the sub-lattices among the GPU cluster nodes. We provide a mixed-precision floating-point GPU implementation of the multi-mass conjugate gradient solver. Our single GPU implementation of the conjugate gradient solver achieves a 9x performance improvement over the highly optimized code executed on a state-of-the-art eight-core CPU node. The overall application executes almost six times faster on a GPU-enabled cluster vs. a conventional multi-core cluster. The developed code is currently used for running production QCD calculations with electromagnetic corrections.

Original language: English (US)
Title of host publication: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Pages: 363-371
Number of pages: 9
State: Published - Oct 3 2011
Event: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 - Anchorage, AK, United States
Duration: May 16 2011 to May 20 2011

Keywords

  • GPU
  • MILC
  • Quantum chromodynamics
  • conjugate gradient

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
