TY - GEN
T1 - Improving reaction kernel performance in Lattice Microbes
T2 - 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016
AU - Hallock, Michael J.
AU - Luthey-Schulten, Zaida
PY - 2016/7/18
Y1 - 2016/7/18
N2 - The reaction kernel for MPD-RDME, the GPU-accelerated reaction-diffusion master equation solver found in Lattice Microbes, uses a large number of kinetic parameters to describe a biochemical network. Many of these parameters are required to compute the system's total reaction propensity, which is used to stochastically evaluate whether a reaction event takes place. In this paper, we examine two techniques for accelerating performance by modifying the total propensity calculation. The first technique is to use a particle-based approach to compute propensities from discrete particles and particle pairs. We find that this technique results in a dramatic improvement in performance for a complex model, approximately 60 times faster. The second technique uses run-time generated source code to automatically create executable code tailored for the biological model being simulated. The removal of all memory reads for constant parameters increases performance for less complex models.
AB - The reaction kernel for MPD-RDME, the GPU-accelerated reaction-diffusion master equation solver found in Lattice Microbes, uses a large number of kinetic parameters to describe a biochemical network. Many of these parameters are required to compute the system's total reaction propensity, which is used to stochastically evaluate whether a reaction event takes place. In this paper, we examine two techniques for accelerating performance by modifying the total propensity calculation. The first technique is to use a particle-based approach to compute propensities from discrete particles and particle pairs. We find that this technique results in a dramatic improvement in performance for a complex model, approximately 60 times faster. The second technique uses run-time generated source code to automatically create executable code tailored for the biological model being simulated. The removal of all memory reads for constant parameters increases performance for less complex models.
UR - http://www.scopus.com/inward/record.url?scp=84991712925&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991712925&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2016.118
DO - 10.1109/IPDPSW.2016.118
M3 - Conference contribution
AN - SCOPUS:84991712925
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 428
EP - 434
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2016 through 27 May 2016
ER -