Abstract
Native proteins have been optimized by evolution simultaneously for structure and sequence. Structural databases reflect this interdependency. In this paper, we present a new statistical potential for a reduced backbone representation that has both structure and sequence characteristics as variables. We use information from structural data available in the Protein Coil Library, selected on the basis of resolution and refinement factor. In these structures, the nonlocal interactions are randomly distributed and, thus, average out in statistics, so structural propensities due to local backbone-based interactions can be studied separately. We collect data in the form of local sequence-specific φ-ψ backbone dihedral pairs. From these data, we construct dihedral probability density functions (DPDFs) that quantify any adjacent φ-ψ pair distribution in the context of all possible combinations of local residue types. We use a probabilistic analysis to deduce how the correlations encoded in the various DPDFs as well as in residue frequencies propagate along the sequence and can be cumulated in a statistical potential capable of efficiently scoring a loop by its backbone conformation and sequence only. Our potential is able to identify with high accuracy the native structure of a loop with a given sequence among possible alternative conformations from sets of well-constructed decoys. Conversely, the potential can also be used for sequence prediction problems and is shown to score the native sequence of a given loop structure among the most fit of the possible sequence combinations. Applications for both structure prediction and sequence design are discussed.
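
As a rough illustration of the idea of turning sequence-specific φ-ψ statistics into a score, the following minimal Python sketch bins observed dihedral pairs per residue type and scores a loop by its summed negative log-probability. It is not the potential described in the paper: it conditions each φ-ψ pair only on that residue's own type and ignores the neighbor-dependent DPDFs and the propagation of correlations along the sequence. The 30° grid, the pseudocount, and all function names (`to_bin`, `build_densities`, `probability`, `score_loop`) are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

BIN_WIDTH = 30                      # degrees; assumed grid resolution, not from the paper
N_BINS = (360 // BIN_WIDTH) ** 2    # number of (phi, psi) grid cells


def to_bin(phi, psi):
    """Map a (phi, psi) pair in degrees to a grid cell, wrapping periodically."""
    i = int((phi + 180) % 360 // BIN_WIDTH)
    j = int((psi + 180) % 360 // BIN_WIDTH)
    return i, j


def build_densities(observations):
    """Count (phi, psi) cells per residue type.

    observations: iterable of (residue_type, phi, psi) tuples.
    """
    counts = defaultdict(Counter)
    for res, phi, psi in observations:
        counts[res][to_bin(phi, psi)] += 1
    return counts


def probability(counts, res, cell, pseudocount=1.0):
    """Smoothed probability of a grid cell given the residue type.

    The additive pseudocount (an assumption) keeps unseen cells nonzero.
    """
    total = sum(counts[res].values())
    return (counts[res][cell] + pseudocount) / (total + pseudocount * N_BINS)


def score_loop(counts, loop):
    """Sum of -ln P(cell | residue type) over the loop; lower is more favorable.

    Simplification: residues are treated as independent, unlike the paper's
    potential, which accounts for local sequence context.
    """
    return -sum(
        math.log(probability(counts, res, to_bin(phi, psi)))
        for res, phi, psi in loop
    )
```

With such a table, candidate conformations of the same sequence can be ranked by `score_loop`, and candidate sequences for a fixed conformation can be ranked the same way, which mirrors the two uses (structure prediction and sequence design) mentioned in the abstract.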
Original language | English (US) |
---|---|
Pages (from-to) | 1859-1869 |
Number of pages | 11 |
Journal | Journal of Physical Chemistry B |
Volume | 114 |
Issue number | 5 |
State | Published - Feb 11 2010 |
ASJC Scopus subject areas
- Physical and Theoretical Chemistry
- Surfaces, Coatings and Films
- Materials Chemistry