TY - JOUR
T1 - The organization of domains in proteins obeys Menzerath-Altmann's law of language
AU - Shahzad, Khuram
AU - Mittenthal, Jay E.
AU - Caetano-Anollés, Gustavo
N1 - Publisher Copyright: © 2015 Shahzad et al.
PY - 2015/8/11
Y1 - 2015/8/11
AB - Background: The combination of domains in multidomain proteins enhances their function and structure but lengthens the molecules and increases their cost at the cellular level. Methods: The dependence of domain length on the number of domains a protein holds was surveyed for a set of 60 proteomes representing free-living organisms from all kingdoms of life. Distributions were fitted using non-linear functions, and the fitted parameters were interpreted with a formulation of decreasing returns. Results: We find that domain length decreases with increasing number of domains in proteins, following the Menzerath-Altmann (MA) law of language. Highly significant negative correlations exist for the set of proteomes examined. Mathematically, the MA law is expressed as a power-law relationship that unfolds when molecular persistence P is a function of domain accretion. P holds two terms, one reflecting the matter-energy cost of adding domains and extending their length, the other reflecting how domain length and number impinge on information and biophysics. The pattern of diminishing returns can therefore be explained as a frustrated interplay between the strategies of economy, flexibility and robustness, matching previously observed trade-offs in the domain makeup of proteomes. Proteomes of Archaea, Fungi and, to a lesser degree, Plants show the largest push towards molecular economy, each at their own economic stratum. Fungi increase domain size in single-domain proteins while reinforcing the pattern of diminishing returns. In contrast, Metazoa, and to lesser degrees Protista and Bacteria, relax economy. Metazoa achieve maximum flexibility and robustness by harboring compact molecules and complex domain organization, offering a new functional vocabulary for molecular biology. Conclusions: The tendency of parts to decrease their size when systems enlarge is universal for language and music, and now for parts of macromolecules, extending the MA law to natural systems.
UR - http://www.scopus.com/inward/record.url?scp=84938797396&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84938797396&partnerID=8YFLogxK
U2 - 10.1186/s12918-015-0192-9
DO - 10.1186/s12918-015-0192-9
M3 - Article
C2 - 26260760
AN - SCOPUS:84938797396
SN - 1752-0509
VL - 9
JO - BMC Systems Biology
JF - BMC Systems Biology
IS - 1
M1 - 44
ER -