TY - JOUR
T1 - Tracing the birth and intrinsic disorder of loops and domains in protein evolution
AU - Caetano-Anollés, Gustavo
AU - Mughal, Fizza
AU - Aziz, M. Fayez
AU - Caetano-Anollés, Kelsey
N1 - Research was supported by grants from the National Science Foundation (MCB-0749836 and OISE-1132791), the United States Department of Agriculture (ILLU-802–909 and ILLU-483–625), and Blue Waters supercomputer allocations from the National Center for Supercomputing Applications (NCSA) to GCA.
PY - 2024/12
Y1 - 2024/12
N2 - Protein loops and structural domains are building blocks of molecular structure. They hold evolutionary memory and are largely responsible for the many functions and processes that drive the living world. Here, we briefly review two decades of phylogenomic data-driven research focusing on the emergence and evolution of these elemental architects of protein structure. Phylogenetic trees of domains reconstructed from the proteomes of organisms belonging to all three superkingdoms and viruses were used to build chronological timelines describing the origin of each domain and its embedded loops at different levels of structural abstraction. These timelines consistently recovered six distinct evolutionary phases and a most parsimonious evolutionary progression of cellular life. The timelines also traced the birth of domain structures from loops, which allowed their growth to be modeled ab initio with AlphaFold2. Accretion decreased the disorder of the growing molecules, suggesting disorder is molecular size-dependent. A phylogenomic survey of disorder revealed that loops and domains evolved differently. Loops were highly disordered, disorder increased early in evolution, and ordered and moderately disordered structures were derived. Gradual replacement of loops with α-helix and β-strand bracing structures over time paved the way for the dominance of more disordered loop types. In contrast, ancient domains were ordered, with disorder evolving as a benefit acquired later in evolution. These evolutionary patterns explain inverse correlations between disorder and sequence length of loops and domains. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function.
AB - Protein loops and structural domains are building blocks of molecular structure. They hold evolutionary memory and are largely responsible for the many functions and processes that drive the living world. Here, we briefly review two decades of phylogenomic data-driven research focusing on the emergence and evolution of these elemental architects of protein structure. Phylogenetic trees of domains reconstructed from the proteomes of organisms belonging to all three superkingdoms and viruses were used to build chronological timelines describing the origin of each domain and its embedded loops at different levels of structural abstraction. These timelines consistently recovered six distinct evolutionary phases and a most parsimonious evolutionary progression of cellular life. The timelines also traced the birth of domain structures from loops, which allowed their growth to be modeled ab initio with AlphaFold2. Accretion decreased the disorder of the growing molecules, suggesting disorder is molecular size-dependent. A phylogenomic survey of disorder revealed that loops and domains evolved differently. Loops were highly disordered, disorder increased early in evolution, and ordered and moderately disordered structures were derived. Gradual replacement of loops with α-helix and β-strand bracing structures over time paved the way for the dominance of more disordered loop types. In contrast, ancient domains were ordered, with disorder evolving as a benefit acquired later in evolution. These evolutionary patterns explain inverse correlations between disorder and sequence length of loops and domains. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function.
KW - Accretion
KW - Folds
KW - Intrinsic disorder
KW - Molecular evolution
KW - Origin of life
KW - Phylogenomic reconstruction
KW - Protein structure
UR - http://www.scopus.com/inward/record.url?scp=85209676389&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85209676389&partnerID=8YFLogxK
U2 - 10.1007/s12551-024-01251-0
DO - 10.1007/s12551-024-01251-0
M3 - Review article
C2 - 39830125
AN - SCOPUS:85209676389
SN - 1867-2450
VL - 16
SP - 723
EP - 735
JO - Biophysical Reviews
JF - Biophysical Reviews
IS - 6
M1 - 045002
ER -