Transforming the Language of Life: Transformer Neural Networks for Protein Prediction Tasks

Ananthan Nambiar, Simon Liu, Mark Hopkins, Maeve Heflin, Sergei Maslov, Anna Ritz

Research output: Working paper

Abstract

The scientific community is rapidly generating protein sequence information, but only a fraction of these proteins can be experimentally characterized. While promising deep learning approaches for protein prediction tasks have emerged, they have computational limitations or are designed to solve a specific task. We present a Transformer neural network that pre-trains task-agnostic sequence representations. This model is fine-tuned to solve two different protein prediction tasks: protein family classification and protein interaction prediction. Our method is comparable to existing state-of-the-art approaches for protein family classification, while being much more general than other architectures. Further, our method outperforms all other approaches for protein interaction prediction. These results offer a promising framework for fine-tuning the pre-trained sequence representations for other protein prediction tasks.
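
The abstract does not say which pre-training objective is used. As a purely illustrative sketch, the Python below assumes a masked-token objective over per-residue tokens, which is one common way to learn task-agnostic sequence representations without labels. The function name mask_sequence, the <mask> token, the 15% masking rate, and the example sequence are all hypothetical and are not taken from the paper.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


def mask_sequence(sequence, mask_rate=0.15, seed=0):
    """Corrupt one protein sequence for a masked-token pre-training objective.

    Assumes one token per residue (an illustrative simplification).
    Returns (tokens, targets): targets[i] holds the original residue where
    position i was masked, and None everywhere else.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    tokens = list(sequence)
    targets = [None] * len(tokens)
    for i, residue in enumerate(tokens):
        if rng.random() < mask_rate:
            targets[i] = residue  # the model would be trained to recover this
            tokens[i] = MASK
    return tokens, targets


# Made-up example sequence, written in one-letter amino acid codes.
example = "MKTAYIAKQRQISFVKSHFSRQ"
tokens, targets = mask_sequence(example)
```

Under this assumption, a model pre-trained to recover the masked residues could then be fine-tuned with a small task-specific output layer for each downstream task, such as family classification or interaction prediction.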
Original language: English (US)
Publisher: Cold Spring Harbor Laboratory Press
Number of pages: 16
DOI: https://doi.org/10.1101/2020.06.15.153643
State: In preparation - Jun 16 2020

Publication series

Name: bioRxiv

Keywords

  • neural networks
  • protein family classification
  • protein-protein interaction prediction
  • COVID-19
  • vaccine design

  • Cite this

    Nambiar, A., Liu, S., Hopkins, M., Heflin, M., Maslov, S., & Ritz, A. (2020). Transforming the Language of Life: Transformer Neural Networks for Protein Prediction Tasks. (bioRxiv). Cold Spring Harbor Laboratory Press. https://doi.org/10.1101/2020.06.15.153643