Amino acids are the building blocks of proteins and enzymes which are essential for life. Understanding amino acid usage offers insights into protein function and molecular mechanisms underlying life histories. However, genome-wide patterns of amino acid usage across domains of life remain poorly understood. Here, we analysed the proteomes of 5590 species across four domains and found that only a few amino acids are consistently the most and least used. This differential usage results in lower amino acid usage diversity at the most and least frequent ranks, creating a ubiquitous inverted U-shape pattern of amino acid diversity and rank which we call an ‘edge effect’ across proteomes and domains of life. This effect likely stems from protein secondary structural constraints, not the evolutionary chronology of amino acid incorporation into the genetic code, highlighting the functional rather than evolutionary influences on amino acid usage. We also tested other contemporary hypotheses regarding amino acid usage in proteomes and found that amino acid usage varies across life’s domains and is only weakly influenced by growth temperature. Our findings reveal a novel and pervasive amino acid usage pattern across genomes with the potential to help us probe deep evolutionary relationships and advance synthetic biology.
Read full abstract