Abstract

Background: The combination of domains in multidomain proteins enhances their function and structure but lengthens the molecules and increases their cost at the cellular level.

Methods: The dependence of domain length on the number of domains a protein holds was surveyed for a set of 60 proteomes representing free-living organisms from all kingdoms of life. Distributions were fitted using non-linear functions, and the fitted parameters were interpreted with a formulation of decreasing returns.

Results: We find that domain length decreases with increasing number of domains in proteins, following the Menzerath-Altmann (MA) law of language. Highly significant negative correlations exist for the set of proteomes examined. Mathematically, the MA law is expressed as a power law relationship that unfolds when molecular persistence P is a function of domain accretion. P holds two terms, one reflecting the matter-energy cost of adding domains and extending their length, the other reflecting how domain length and number impinge on information and biophysics. The pattern of diminishing returns can therefore be explained as a frustrated interplay between the strategies of economy, flexibility and robustness, matching previously observed trade-offs in the domain makeup of proteomes. Proteomes of Archaea, Fungi and, to a lesser degree, Plants show the largest push towards molecular economy, each at its own economic stratum. Fungi increase domain size in single-domain proteins while reinforcing the pattern of diminishing returns. In contrast, Metazoa, and to lesser degrees Protista and Bacteria, relax economy. Metazoa achieve maximum flexibility and robustness by harboring compact molecules and complex domain organization, offering a new functional vocabulary for molecular biology.

Conclusions: The tendency of parts to decrease their size when systems enlarge is universal for language and music, and now for parts of macromolecules, extending the MA law to natural systems.

Highlights

  • The combination of domains in multidomain proteins enhances their function and structure but lengthens the molecules and increases their cost at the cellular level

  • The longer the protein, the smaller its structural domains. We studied the dependence of the average domain length of a protein on the number of protein domains it holds (k) for a set of 60 proteomes representing organisms in superkingdoms Archaea and Bacteria and kingdoms Metazoa, Fungi, Plants and Protista of superkingdom Eukarya

  • Given the theoretical link that exists between b and both domain cooperativity and stability elaborated above, and the high surface area to volume ratio detected in new emergent proteins [27], we propose that the ‘compressible’ property is associated with contact density in domain structures, i.e., the fraction of buried sites in the atomic structure
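
As a reading aid, the power-law form of the MA law referred to in the abstract can be written out as below. This is a sketch under stated assumptions, not the paper's own equation: Y_k is taken to be the average domain length of proteins with k domains, b is assumed to be the exponent of the power law, and the prefactor a is a label introduced here for illustration.

```latex
% Assumed notation: Y_k = average domain length at domain number k,
% a = prefactor (hypothetical label), b = exponent (assumed role of b).
\begin{align}
  Y_k &= a\,k^{b}, \qquad b < 0, \\
  \ln Y_k &= \ln a + b \ln k .
\end{align}
% At k = 1 the form reduces to Y_1 = a, and a negative b makes Y_k
% decrease as k grows, which is the pattern of diminishing returns.
```

In this form, a more negative b corresponds to a steeper decrease of domain length with domain number.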

Summary

Methods

We selected 60 proteomes of free-living species from the highly curated dataset of Wang et al. [25], which holds ~3 million sequences (from 745 proteomes) with structural domains assigned using hidden Markov models (HMMs) of structural recognition in SUPERFAMILY [46]. We averaged domain lengths (Ykj) against each domain number (k) for the selected proteins. Selecting K’ ≤ 5 domains decreases the number of proteins retained from 99.7% to 95%. This brackets the K’ = 13 domain boundary by exactly k = ±7. Effective average protein lengths (Le) were calculated using Eq. (12). L1 describes the average length of single-domain proteins, which serves to define an upper bound for the MA-dependency of a proteome. Le describes the sum of the lengths of the individual domain constituents of a protein, which is an indicator of mass economy for growth rate optimization.
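
The averaging step described above, together with a power-law fit of the kind the abstract associates with the MA law, can be sketched as follows. This is a minimal illustration and not the authors' pipeline. The function names and the toy input are hypothetical. The code averages per-protein mean domain lengths at each k, which is only one possible reading of Ykj. It also swaps in ordinary least squares on log-transformed values in place of the non-linear fitting used in the study, and the `k_max` argument merely stands in for a K’ cutoff.

```python
from collections import defaultdict
import math


def mean_domain_length_by_k(proteins, k_max=None):
    """Average domain length for each domain number k.

    proteins: iterable of lists, one list of domain lengths per protein
              (hypothetical input format).
    k_max:    optional cutoff; proteins with more than k_max domains are skipped.
    Returns a dict {k: mean of per-protein average domain lengths}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for domains in proteins:
        k = len(domains)
        if k == 0 or (k_max is not None and k > k_max):
            continue
        sums[k] += sum(domains) / k  # average domain length of this protein
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


def fit_power_law(y_by_k):
    """Fit Y = a * k**b by least squares on ln Y = ln a + b * ln k.

    This log-log regression is a simplification assumed here,
    not the non-linear fit reported in the source.
    Returns (a, b).
    """
    xs = [math.log(k) for k in y_by_k]
    ys = [math.log(y) for y in y_by_k.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct k values to fit")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Toy data (invented for illustration only, not taken from the study).
toy_proteins = [[210], [190], [150, 130], [120, 110, 100]]
toy_means = mean_domain_length_by_k(toy_proteins)
toy_a, toy_b = fit_power_law(toy_means)
```

A negative fitted exponent on real data would correspond to the decrease of domain length with domain number reported in the abstract; the toy values above carry no biological meaning.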

