Abstract

We examine whether the frequency of amino acids across an organism's proteome is determined primarily by optimization for function or by other factors, such as the structure of the genetic code. Considering all available proteins together, we first point out that the frequency of an amino acid in a proteome correlates negatively with its mass, suggesting that the genome preserves a fundamental distribution ruled by simple energetics. Given the universality of such distributions, the outliers, cysteine and leucine, identify amino acids that deviate from this simple rule for functional purposes, and we examine those functions. We quantify the strength of such selection as the entropic cost that outliers pay to defy the mass-frequency relation. The codon degeneracy of an amino acid partially explains the correlation between mass and frequency: light amino acids are typically encoded by highly degenerate codon families, with the exception of arginine. While degeneracy may be a factor in hardwiring the relationship between mass and frequency in proteomes, it does not provide a complete explanation. By examining extremophiles, we show that this law weakens with temperature, likely because of protein stability considerations; the environment is therefore an essential factor.
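The mass-frequency relation described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes protein sequences given as one-letter strings, uses approximate textbook molecular masses of the free amino acids, and uses a plain Pearson coefficient as the correlation measure; the abstract does not say which statistic was used. The function names `amino_acid_frequencies` and `mass_frequency_correlation` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Approximate average molecular masses (g/mol) of the 20 free amino acids.
# Assumption: textbook values, not taken from the paper.
MASS = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}


def amino_acid_frequencies(sequences):
    """Pool all sequences and return the relative frequency of each residue."""
    counts = Counter()
    for seq in sequences:
        # Skip non-standard symbols such as X, B, Z or gap characters.
        counts.update(aa for aa in seq.upper() if aa in MASS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in MASS}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mass_frequency_correlation(sequences):
    """Correlate residue mass with pooled residue frequency."""
    freqs = amino_acid_frequencies(sequences)
    residues = sorted(MASS)
    return pearson([MASS[aa] for aa in residues],
                   [freqs[aa] for aa in residues])


if __name__ == "__main__":
    # Toy input, invented for illustration only; a real analysis would
    # read a full proteome, for example from a FASTA file.
    toy_proteome = ["GGGGAAAS", "GGAW"]
    print(mass_frequency_correlation(toy_proteome))
```

A negative value indicates that heavier residues are rarer in the pooled sequences. Residues that sit far from the overall trend would be the candidates for the outlier analysis the abstract describes.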
