Boosting protein threading accuracy

Jian Peng, Jinbo Xu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution


Protein threading is one of the most successful protein structure prediction methods. Most protein threading methods use a scoring function that linearly combines sequence and structure features to measure the quality of a sequence-template alignment, so that a dynamic programming algorithm can be used to optimize the scoring function. However, a linear scoring function cannot fully exploit interdependency among features and thus limits alignment accuracy. This paper presents a nonlinear scoring function for protein threading, which not only can model interactions among different protein features, but also can be efficiently optimized using a dynamic programming algorithm. We achieve this by modeling the threading problem with a probabilistic graphical model, Conditional Random Fields (CRF), and training the model with the gradient tree boosting algorithm. The resultant model is a nonlinear scoring function consisting of a collection of regression trees. Each regression tree models a type of nonlinear relationship among sequence and structure features. Experimental results indicate that this new threading model can effectively leverage weak biological signals and greatly improve both alignment accuracy and fold recognition rate.
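
The abstract's central point, that a nonlinear scoring function can still be optimized by dynamic programming, can be illustrated with a small sketch. The code below is not the authors' CRF model or their boosted regression trees: it is a plain global alignment with a fixed gap penalty, and the names `align`, `toy_tree_score`, and `HYDROPHOBIC` are hypothetical. It only shows that dynamic programming needs the score to decompose over alignment positions, not to be linear in the features.

```python
from typing import Callable, List, Tuple

# Assumption for illustration: a crude hydrophobic residue class.
HYDROPHOBIC = set("AVLIMFWC")


def toy_tree_score(a: str, b: str) -> float:
    """Hypothetical piecewise-constant score, shaped like a tiny regression tree."""
    if a == b:
        return 2.0
    if (a in HYDROPHOBIC) == (b in HYDROPHOBIC):
        return 0.5
    return -1.0


def align(seq: str, tmpl: str,
          match_score: Callable[[str, str], float],
          gap: float = -1.0) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment by dynamic programming with an arbitrary local score."""
    n, m = len(seq), len(tmpl)
    # best[i][j] = best score for aligning seq[:i] with tmpl[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + match_score(seq[i - 1], tmpl[j - 1]),
                best[i - 1][j] + gap,   # residue of seq left unaligned
                best[i][j - 1] + gap,   # position of tmpl left unaligned
            )
    # Trace back from the corner to recover the aligned index pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        diag = best[i - 1][j - 1] + match_score(seq[i - 1], tmpl[j - 1])
        if best[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


score, pairs = align("AVK", "AK", toy_tree_score)
print(score, pairs)
```

Replacing `toy_tree_score` with any other function of the local features leaves the recurrence unchanged, which is why a sum of regression trees can serve as the score in this kind of setup.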

Original language: English (US)
Title of host publication: Research in Computational Molecular Biology - 13th Annual International Conference, RECOMB 2009, Proceedings
Number of pages: 15
State: Published - 2009
Externally published: Yes
Event: 13th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2009 - Tucson, AZ, United States
Duration: May 18 2009 to May 21 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5541 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349




  • Conditional random fields
  • Gradient tree boosting
  • Nonlinear scoring function
  • Protein threading
  • Regression tree

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

