PALM: Probabilistic Area Loss Minimization for Protein Sequence Alignment

Fan Ding, Nan Jiang, Jianzhu Ma, Jian Peng, Jinbo Xu, Yexiang Xue

Research output: Contribution to journal › Conference article › peer-review

Abstract

Protein sequence alignment is a fundamental problem in computational structural biology and is widely used for protein 3D structure prediction and protein homology detection. Most existing programs for detecting protein sequence alignments are based on likelihood information of amino acids and are sensitive to alignment noise. We present PALM, a robust method for modeling pairwise protein structure alignments that uses the area distance to reduce biological measurement noise. PALM generatively learns the alignment of two protein sequences with a probabilistic area distance objective, which can denoise measurement errors and offsets introduced by different biologists. We show that learning is computationally efficient because the gradients can be estimated by dynamically sampling alignments. Empirically, PALM generates sequence alignments with higher precision and recall, as well as smaller area distance, than competing methods, especially for long protein sequences and remote homologies. These results suggest that PALM is worth trying for learning over large-scale protein sequence alignment problems.
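
The abstract does not define the area distance. One plausible reading is the area enclosed between two alignment paths drawn on the grid spanned by the two sequences. The sketch below illustrates that reading only and is not the authors' implementation. The function names and the operation letters (`M` for an aligned pair, `X` for a residue consumed only from the first sequence, `Y` for a residue consumed only from the second) are assumptions made for this example.

```python
from typing import List


def path_offsets(ops: str) -> List[int]:
    """Return v = x - y sampled at every integer u = x + y along a path.

    Hypothetical encoding: 'M' moves diagonally (x += 1, y += 1),
    'X' moves only along the first sequence (x += 1),
    'Y' moves only along the second sequence (y += 1).
    """
    v = [0]
    for op in ops:
        if op == "M":
            # A diagonal step covers two units of u and leaves v unchanged.
            v.extend([v[-1], v[-1]])
        elif op == "X":
            v.append(v[-1] + 1)
        elif op == "Y":
            v.append(v[-1] - 1)
        else:
            raise ValueError(f"unknown alignment operation: {op!r}")
    return v


def area_distance(ops_a: str, ops_b: str) -> float:
    """Area enclosed between two alignment paths over the same sequence pair."""
    va, vb = path_offsets(ops_a), path_offsets(ops_b)
    if len(va) != len(vb) or va[-1] != vb[-1]:
        raise ValueError("both paths must align the same pair of sequences")
    diff = [p - q for p, q in zip(va, vb)]
    total = 0.0
    for d0, d1 in zip(diff, diff[1:]):
        if d0 * d1 >= 0:
            # No sign change on this unit interval: trapezoid rule on |diff|.
            total += (abs(d0) + abs(d1)) / 2
        else:
            # Sign change inside the interval: sum the two triangles.
            total += (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))
    # Change of variables (u, v) -> (x, y) halves the area.
    return total / 2


if __name__ == "__main__":
    print(area_distance("MM", "XXYY"))  # 2.0, the triangle below the 2x2 diagonal
    print(area_distance("XY", "YX"))    # 1.0, one full grid cell
    print(area_distance("MXM", "MXM"))  # 0.0, identical paths
```

Under this reading, two alignments that differ only by a small shift enclose a small area even when few of their aligned pairs match exactly, which fits the abstract's claim that the area distance tolerates measurement offsets.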

Original language: English (US)
Pages (from-to): 1100-1109
Number of pages: 10
Journal: Proceedings of Machine Learning Research
Volume: 161
State: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: Jul 27 2021 to Jul 30 2021

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
