Reductive evolution of proteomes and protein structures

Minglei Wang, Charles G. Kurland, Gustavo Caetano-Anollés

Research output: Contribution to journal › Article › peer-review


The lengths of orthologous protein families in Eukarya are almost double the lengths found in Bacteria and Archaea. Here we examine protein structures in 745 genomes and show that protein length differences between superkingdoms arise from much shorter prokaryotic nondomain linker sequences. Eukaryotic, bacterial, and archaeal linkers are 250, 86, and 73 aa residues in length, respectively, whereas folded domain sequences are 281, 280, and 256 residues, respectively. Cryptic domains match linkers (P > 0.0001) with probabilities ranging between 0.022 and 0.042; accordingly, they do not affect length estimates significantly. Linker sequences support intermolecular binding within proteomes and they are probably enriched in intrinsically disordered regions as well. Reductively evolved linker sequence lengths in growth rate maximized cells should be proportional to proteome diversity. By using total in-frame coding capacity of a genome [i.e., coding sequence (CDS)] as a reliable measure of proteome diversity, we find linker lengths of prokaryotes clearly evolve in proportion to CDS values, whereas those of eukaryotes are more randomly larger than expected. Domain lengths scarcely change over the entire range of CDS values. Thus, the protein linkers of prokaryotes evolve reductively whereas those of eukaryotes do not.
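As a rough illustration of the length comparison in the abstract, the sketch below expresses each superkingdom's mean linker and domain length as a fraction of the eukaryotic mean. The six input values are the ones reported above; the dictionary layout and the helper name `ratio_to_eukarya` are illustrative assumptions, not part of the authors' analysis.

```python
# Mean lengths in aa residues, taken from the abstract.
linker = {"Eukarya": 250, "Bacteria": 86, "Archaea": 73}
domain = {"Eukarya": 281, "Bacteria": 280, "Archaea": 256}


def ratio_to_eukarya(lengths):
    """Return each mean length as a fraction of the eukaryotic mean.

    Hypothetical helper for illustration only.
    """
    reference = lengths["Eukarya"]
    return {kingdom: value / reference for kingdom, value in lengths.items()}


linker_ratio = ratio_to_eukarya(linker)
domain_ratio = ratio_to_eukarya(domain)

for kingdom in linker:
    print(
        f"{kingdom}: linker {linker_ratio[kingdom]:.2f}, "
        f"domain {domain_ratio[kingdom]:.2f}"
    )
```

Under these figures the prokaryotic linkers come out at roughly a third of the eukaryotic linker length, while the domain lengths stay above nine tenths of the eukaryotic value, which is consistent with the abstract's claim that the difference lies in the linkers rather than the domains.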

Original language: English (US)
Pages (from-to): 11954-11958
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 29
State: Published - Jul 19 2011


Keywords

  • Evolutionary constraint
  • Intrinsic disorder
  • Protein domain

ASJC Scopus subject areas

  • General

