Recent Advances in Machine Learning Variant Effect Prediction Tools for Protein Engineering

Jesse Horne, Diwakar Shukla

Research output: Contribution to journal, Review article, peer-review

Abstract

Proteins are Nature’s molecular machinery and fill diverse roles despite consisting of chemically similar building blocks. In recent years, protein engineering and design have become important research areas, with many applications in the pharmaceutical, energy, and biocatalysis fields, among others, where the ultimate aim is to create a protein with desired structural and functional properties. It is often critical to model the relationship between a protein’s sequence, folded structure, and biological function to assist in such protein engineering pursuits. However, significant challenges remain in concretely mapping an amino acid sequence to specific protein properties and biological activities. Mutations may enhance or diminish molecular protein function, and the epistatic interactions between mutations result in an inherently complex mapping between genetic modifications and protein function. Therefore, estimating the quantitative effects of mutations on protein function(s), a task often known as variant effect prediction (VEP), remains a grand challenge of biology, bioinformatics, and many related fields, and solving it would rapidly accelerate protein engineering tasks. Nevertheless, machine learning (ML) methods developed in recent years have demonstrated progress in modeling the relationship between mutations and protein function. In this Review, recent advances in VEP are discussed as tools for protein engineering, focusing on techniques that incorporate gains from the broader ML community and on the challenges of estimating biomolecular functional differences. Primary developments highlighted include convolutional neural networks, graph neural networks, and natural language embeddings for protein sequences.

Original language: English (US)
Pages (from-to): 6235-6245
Number of pages: 11
Journal: Industrial and Engineering Chemistry Research
Volume: 61
Issue number: 19
State: Published - May 18, 2022

ASJC Scopus subject areas

  • General Chemistry
  • General Chemical Engineering
  • Industrial and Manufacturing Engineering
