TY - JOUR
T1 - Optimization of the coupled cluster implementation in NWChem on petascale parallel architectures
AU - Anisimov, Victor M.
AU - Bauer, Gregory H.
AU - Chadalavada, Kalyana
AU - Olson, Ryan M.
AU - Glenski, Joseph W.
AU - Kramer, William T.C.
AU - Aprà, Edoardo
AU - Kowalski, Karol
N1 - Publisher Copyright: © 2014 American Chemical Society.
PY - 2014/10/14
Y1 - 2014/10/14
N2 - The coupled cluster singles and doubles (CCSD) algorithm in the NWChem software package has been optimized to alleviate the communication bottleneck. This optimization provided a 2-fold to 5-fold speedup in the CCSD iteration time, depending on the problem size and available memory, and improved the CCSD scaling to 20 000 nodes of the NCSA Blue Waters supercomputer. On 20 000 XE6 nodes of Blue Waters, a complete conventional CCSD(T) calculation of a system comprising 1042 basis functions and 103 occupied correlated orbitals achieved a performance of 0.32 petaflop/s and took 5 h and 24 min to complete. The reported time and performance include all stages of the calculation, from initialization to termination, for the iterative single and double excitations as well as the perturbative triples correction. In the perturbative triples alone, the computation sustained a rate of 1.18 petaflop/s. The CCSD and (T) phases took approximately 3/4 and 1/4 of the total time to solution, respectively, showing that CCSD is the most time-consuming part at large scale. The MP2, CCSD, and CCSD(T) computations in the 6-311++G basis set, performed on guanine-cytosine deoxydinucleotide monophosphate, probed the conformational energy difference between the A- and B-conformations of single-stranded DNA. Good agreement between MP2 and the coupled cluster methods was obtained, suggesting the utility of MP2 for conformational analysis in these systems. The study revealed a significant discrepancy between the quantum mechanical and classical force field predictions, suggesting a need to improve the dihedral parameters.
UR - http://www.scopus.com/inward/record.url?scp=84908010804&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908010804&partnerID=8YFLogxK
U2 - 10.1021/ct500404c
DO - 10.1021/ct500404c
M3 - Article
C2 - 26588127
AN - SCOPUS:84908010804
SN - 1549-9618
VL - 10
SP - 4307
EP - 4316
JO - Journal of Chemical Theory and Computation
JF - Journal of Chemical Theory and Computation
IS - 10
ER -