Abstract

Communication is an undisputed central activity of life that requires an evolving molecular language. It conveys meaning through messages and vocabularies. Here, I explore the existence of a growing vocabulary in the molecules and molecular functions of the microbial world. There are clear correspondences between the lexicon, syntax, semantics, and pragmatics of language organization and the module, structure, function, and fitness paradigms of molecular biology. These correspondences are constrained by universal laws and engineering principles. Macromolecular structure, for example, follows quantitative linguistic patterns arising from statistical laws that are likely universal, including Zipf’s law, a special case of the scale-free distribution; Heaps’ law, which describes the sublinear growth typical of economies of scale; and the Menzerath–Altmann law, which imposes size-dependent patterns of decreasing returns. Trade-off solutions between principles of economy, flexibility, and robustness define a “triangle of persistence” describing the impact of the environment on a biological system. The pragmatic landscape of the triangle interfaces with the syntax and semantics of molecular languages, which together with comparative and evolutionary genomic data can explain global patterns of diversification of cellular life. The vocabularies of proteins (proteomes) and functions (functionomes) revealed a significant universal lexical core supporting a universal common ancestor, an ancestral evolutionary link between Bacteria and Eukarya, and distinct reductive evolutionary strategies of language compression in Archaea and Bacteria. A “causal” word cloud strategy inspired by the dependency grammar paradigm used in catenae unfolded the evolution of lexical units associated with Gene Ontology terms at different levels of ontological abstraction.
While Archaea holds the smallest, oldest, and most homogeneous vocabulary of all superkingdoms, Bacteria heterogeneously apportions a more complex vocabulary, and Eukarya pushes functional innovation through mechanisms of flexibility and robustness.
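
For reference, the three statistical laws named in the abstract are commonly written in the forms sketched below. These are the standard textbook formulations, not equations taken from this article, and the symbols (rank r, frequency f, sample size n, vocabulary size V, construct size x, constituent size y, and the constants α, β, K, a, b, c) are assumptions introduced here for illustration.

```latex
% Zipf's law: the frequency of a lexical unit decays as a power of its rank r
f(r) \propto r^{-\alpha}, \qquad \alpha > 0

% Heaps' law: vocabulary size V grows sublinearly with the sample size n
V(n) = K\, n^{\beta}, \qquad 0 < \beta < 1

% Menzerath--Altmann law: mean constituent size y as a function of construct size x
y(x) = a\, x^{b}\, e^{-c x}
```

In the Menzerath–Altmann form, a negative b makes constituent size decrease as the construct grows, which is the usual reading of "decreasing returns".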

Highlights

  • I discuss how language laws are constrained by the engineering of the emerging biological systems and trade-off solutions between economy, flexibility, and robustness

  • I focus on the evolution of molecular and functional vocabularies and how they reveal illuminating patterns of molecular origin and diversification that are consistent with engineering trade-offs

Summary

INTRODUCTION

“The place where I come from is a small town, They think so small, they use small words But not me, I’m smarter than that, I worked it out I’ve been stretching my mouth, to let those big words come right out”. Organisms of the six kingdoms of life exhibit clear patterns of scope, budget, flexibility, and robustness derived from significant evidential support (e.g., speed, cell size, spatial range, life span, nutrition, molecular makeup; Yafremava et al., 2013). This information can be used to display the trade-offs between the three persistence strategies in the triangle (Figure 3B). While Venn data suggest the protein vocabularies of Archaea are significantly compressed, similar patterns exhibited by Bacteria and Eukarya challenge the scaling relations of the domain probability distributions of Figure 2. Boxplots for the 786 universal ABE FSFs revealed a progression of median f-values for Archaea (f = 0.6), Bacteria (0.74), and Eukarya (0.90). This result again supports the effect of evolutionary reductive forces acting on both microbial superkingdoms and the significant apportionment of FFs in proteomes. The word cloud of lexical units of level 2 terms makes evident the functional activities associated with the two tendencies, one focusing on binding, enzymatic activities, transport, and regulation and the other on building higher-level structures with structural constituents and channels and regulatory and neurotransmitter activities.
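
The median f-values quoted above can be illustrated with a minimal Python sketch. It assumes that an f-value is the fraction of proteomes in a group that contain a given FSF; this summary does not define f, so that reading is an assumption. The function name `f_values` and the toy data are hypothetical and do not come from the article.

```python
from statistics import median


def f_values(proteomes):
    """Return {domain: fraction of proteomes that contain it}.

    ASSUMPTION: f is taken to be a distribution index (presence fraction).
    `proteomes` maps an organism name to the set of domains it encodes.
    """
    total = len(proteomes)
    counts = {}
    for domains in proteomes.values():
        for domain in set(domains):  # count each domain once per proteome
            counts[domain] = counts.get(domain, 0) + 1
    return {domain: n / total for domain, n in counts.items()}


# Hypothetical toy data: four made-up proteomes and three made-up domains.
toy = {
    "organism_A": {"fsf1", "fsf2", "fsf3"},
    "organism_B": {"fsf1", "fsf2"},
    "organism_C": {"fsf1", "fsf3"},
    "organism_D": {"fsf1"},
}

f = f_values(toy)
median_f = median(f.values())  # the summary statistic a boxplot would mark
```

Under this reading, a higher median f for a superkingdom means its universal domains are spread more evenly across its proteomes, while a lower median points to patchier retention of the kind expected from reductive loss.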

CONCLUSION AND PROSPECTS
Findings
DATA AVAILABILITY STATEMENT