TY - JOUR
T1 - The EFI Web Resource for Genomic Enzymology Tools
T2 - Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways
AU - Zallot, Rémi
AU - Oberg, Nils
AU - Gerlt, John A.
N1 - Publisher Copyright: Copyright © 2019 American Chemical Society.
PY - 2019/10/15
Y1 - 2019/10/15
N2 - The assignment of functions to uncharacterized proteins discovered in genome projects requires easily accessible tools and computational resources for large-scale, user-friendly leveraging of the protein, genome, and metagenome databases by experimentalists. This article describes the web resource developed by the Enzyme Function Initiative (EFI; accessed at https://efi.igb.illinois.edu/) that provides "genomic enzymology" tools ("web tools") for (1) generating sequence similarity networks (SSNs) for protein families (EFI-EST); (2) analyzing and visualizing genome context of the proteins in clusters in SSNs (in genome neighborhood networks, GNNs, and genome neighborhood diagrams, GNDs) (EFI-GNT); and (3) prioritizing uncharacterized SSN clusters for functional assignment based on metagenome abundance (chemically guided functional profiling, CGFP) (EFI-CGFP). The SSNs generated by EFI-EST are used as the input for EFI-GNT and EFI-CGFP, enabling easy transfer of information among the tools. The networks are visualized and analyzed using Cytoscape, a widely used desktop application; GNDs and CGFP heatmaps summarizing metagenome abundance are viewed within the tools. We provide a detailed example of the integrated use of the tools with an analysis of glycyl radical enzyme superfamily (IPR004184) found in the human gut microbiome. This analysis demonstrates that (1) SwissProt annotations are not always correct, (2) large-scale genome context analyses allow the prediction of novel metabolic pathways, and (3) metagenome abundance can be used to identify/prioritize uncharacterized proteins for functional investigation.
UR - http://www.scopus.com/inward/record.url?scp=85073165701&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073165701&partnerID=8YFLogxK
U2 - 10.1021/acs.biochem.9b00735
DO - 10.1021/acs.biochem.9b00735
M3 - Article
C2 - 31553576
AN - SCOPUS:85073165701
SN - 0006-2960
VL - 58
SP - 4169
EP - 4182
JO - Biochemistry
JF - Biochemistry
IS - 41
ER -