Monte Carlo MP2 on Many Graphical Processing Units

Alexander E. Doran, So Hirata

Research output: Contribution to journalArticle

Abstract

In the Monte Carlo second-order many-body perturbation (MC-MP2) method, the long sum-of-product matrix expression of the MP2 energy, whose literal evaluation may be poorly scalable, is recast into a single high-dimensional integral of functions of electron pair coordinates, which is evaluated by the scalable method of Monte Carlo integration. The sampling efficiency is further accelerated by the redundant-walker algorithm, which allows a maximal reuse of electron pairs. Here, a multitude of graphical processing units (GPUs) offers a uniquely ideal platform to expose multilevel parallelism: fine-grain data-parallelism for the redundant-walker algorithm in which millions of threads compute and share orbital amplitudes on each GPU; coarse-grain instruction-parallelism for near-independent Monte Carlo integrations on many GPUs with few and infrequent interprocessor communications. While the efficiency boost by the redundant-walker algorithm on central processing units (CPUs) grows linearly with the number of electron pairs and tends to saturate when the latter exceeds the number of orbitals, on a GPU it grows quadratically before it increases linearly and then eventually saturates at a much larger number of pairs. This is because the orbital constructions are nearly perfectly parallelized on a GPU and thus completed in a near-constant time regardless of the number of pairs. In consequence, an MC-MP2/cc-pVDZ calculation of a benzene dimer is 2700 times faster on 256 GPUs (using 2048 electron pairs) than on two CPUs, each with 8 cores (which can use only up to 256 pairs effectively). We also numerically determine that the cost to achieve a given relative statistical uncertainty in an MC-MP2 energy increases as O(n3) or better with system size n, which may be compared with the O(n5) scaling of the conventional implementation of deterministic MP2. We thus establish the scalability of MC-MP2 with both system and computer sizes.

Original languageEnglish (US)
Pages (from-to)4821-4832
Number of pages12
JournalJournal of Chemical Theory and Computation
Volume12
Issue number10
DOIs
StatePublished - Oct 11 2016

    Fingerprint

ASJC Scopus subject areas

  • Computer Science Applications
  • Physical and Theoretical Chemistry

Cite this