TY - GEN
T1 - Direct self-consistent field computations on GPU clusters
AU - Shi, Guochun
AU - Kindratenko, Volodymyr
AU - Ufimtsev, Ivan
AU - Martinez, Todd
PY - 2010
Y1 - 2010
N2 - We present an implementation of one of the direct self-consistent-field (DSCF) calculation techniques, the restricted Hartree-Fock method, on a high-performance computing cluster outfitted with graphics processing units (GPUs) and demonstrate its effectiveness and scalability up to 128 cluster nodes on molecules of as many as 1,732 atoms. We discuss the overall parallel application architecture, which relies on the Message Passing Interface (MPI) to distribute the workload among GPU cluster nodes and on POSIX threads to manage the GPUs within each node. This approach of combining coarse- and fine-grained parallelism on a distributed-memory system makes it possible to perform DSCF calculations on molecules that until now have been out of reach because of their excessive computational requirements.
AB - We present an implementation of one of the direct self-consistent-field (DSCF) calculation techniques, the restricted Hartree-Fock method, on a high-performance computing cluster outfitted with graphics processing units (GPUs) and demonstrate its effectiveness and scalability up to 128 cluster nodes on molecules of as many as 1,732 atoms. We discuss the overall parallel application architecture, which relies on the Message Passing Interface (MPI) to distribute the workload among GPU cluster nodes and on POSIX threads to manage the GPUs within each node. This approach of combining coarse- and fine-grained parallelism on a distributed-memory system makes it possible to perform DSCF calculations on molecules that until now have been out of reach because of their excessive computational requirements.
KW - GPU
KW - Restricted Hartree-Fock
UR - http://www.scopus.com/inward/record.url?scp=77953970681&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953970681&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2010.5470478
DO - 10.1109/IPDPS.2010.5470478
M3 - Conference contribution
AN - SCOPUS:77953970681
SN - 9781424464432
T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
T2 - 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010
Y2 - 19 April 2010 through 23 April 2010
ER -