TY - JOUR
T1 - Unbiased, scalable sampling of protein loop conformations from probabilistic priors
AU - Zhang, Yajia
AU - Hauser, Kris
N1 - Funding Information:
The authors thank Predrag Radivojac for valuable discussions that inspired us to start this project and helped clarify our understanding of protein structure and function. This research is partially supported by NSF Grant No. 1218534.
PY - 2013
Y1 - 2013
N2 - Background: Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in adhoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Results: Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Conclusion: Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrate its potential as a tool for loop ensemble generation and missing structure completion.
AB - Background: Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in adhoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Results: Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Conclusion: Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrate its potential as a tool for loop ensemble generation and missing structure completion.
KW - Conformation sampling
KW - Monte Carlo methods
KW - ensemble generation
KW - graphical models
KW - protein loops
UR - http://www.scopus.com/inward/record.url?scp=84887494973&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84887494973&partnerID=8YFLogxK
U2 - 10.1186/1472-6807-13-S1-S9
DO - 10.1186/1472-6807-13-S1-S9
M3 - Article
C2 - 24565175
AN - SCOPUS:84887494973
SN - 1472-6807
VL - 13
JO - BMC Structural Biology
JF - BMC Structural Biology
IS - SUPPL.1
M1 - S9
ER -