Enzyme function prediction using contrastive learning

Tianhao Yu, Haiyang Cui, Jianan Canal Li, Yunan Luo, Guangde Jiang, Huimin Zhao

Research output: Contribution to journal › Article › peer-review


Enzyme function annotation is a fundamental challenge, and numerous computational tools have been developed. However, most of these tools cannot accurately predict functional annotations, such as enzyme commission (EC) number, for less-studied proteins or those with previously uncharacterized functions or multiple activities. We present a machine learning algorithm named CLEAN (contrastive learning–enabled enzyme annotation) to assign EC numbers to enzymes with better accuracy, reliability, and sensitivity compared with the state-of-the-art tool BLASTp. The contrastive learning framework empowers CLEAN to confidently (i) annotate understudied enzymes, (ii) correct mislabeled enzymes, and (iii) identify promiscuous enzymes with two or more EC numbers—functions that we demonstrate by systematic in silico and in vitro experiments. We anticipate that this tool will be widely used for predicting the functions of uncharacterized enzymes, thereby advancing many fields, such as genomics, synthetic biology, and biocatalysis.

With rapidly growing genomic and metagenomic databases, we have vastly more sequence data than functional data for enzymes. Accurate functional annotation from sparse experimental evidence is therefore crucial for analysis and applications when working from sequence data. Hoping to circumvent the limitations of current approaches, Yu et al. developed a machine learning model based on contrastive learning that performs particularly well at discerning enzyme function. In addition to comparing the performance of the method with existing tools, the authors experimentally validated predicted functions of 36 enzymes that form carbon–halogen bonds. They found excellent prediction accuracy and the ability to distinguish between similar activities. —MAF

A contrastive learning algorithm enables accurate enzyme function annotation.
Original language: English (US)
Pages (from-to): 1358-1363
Number of pages: 6
Issue number: 6639
State: Published - Mar 31 2023

