The Standard Genetic Code (SGC) is a mapping from 64 codons to the 20 natural amino acids plus a stop signal. The SGC is commonly considered to have evolved to lessen the consequences of translational errors, such as the charging of incorrect amino acids during protein synthesis or an encounter with a stop codon. The Circular code X consists of 20 codons that are prevalent in the protein-coding genes of most species, including viruses, archaea, plasmids, bacteria, and eukaryotes. The Circular code X has a remarkable error-correcting property, and its motifs are significantly abundant in genes, allowing the proper reading frame to be detected and preserved. In this work, we put forward the hypothesis that the RNY Comma-free code is better optimized than the Circular code X at reducing the consequences of frameshift errors. The RNY code (a Self-complementary Comma-free code) consists of 16 codons observed statistically in gene sequences, where R stands for Purine, N for any nucleotide, and Y for Pyrimidine, and it has a useful error-correcting property with respect to frameshifts. We employ two previously developed score measures, the code score and the di-codon score, and estimate the optimality of different codes based on numerous physicochemical properties across all plausible frameshift errors. We show that the RNY Comma-free code is better optimized than the Circular code X at reducing the effects of frameshift errors. Furthermore, previously established results lead us to infer that the Circular code X is better optimized than the SGC at reducing the impact of frameshift errors. Accordingly, in terms of optimality following frameshift errors, the RNY Comma-free code outperforms the SGC. Our results support Crick's assertion that genetic codes evolved from archaic Comma-free codes, which may be more error-tolerant and hence more robust.
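The structural properties attributed to the RNY code above can be checked directly. The following is a minimal sketch, not the scoring method used in this work: it enumerates the 16 RNY codons and tests the comma-free and self-complementary properties. The function names are hypothetical, and the DNA alphabet (T rather than U) is an assumption made for illustration.

```python
from itertools import product

PURINES = "AG"       # R
PYRIMIDINES = "CT"   # Y (DNA alphabet assumed; use "CU" for RNA)
BASES = "ACGT"       # N, any nucleotide

# Watson-Crick complement table for the assumed DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rny_codons():
    """Return the set of all codons of the form purine-any-pyrimidine."""
    return {r + n + y for r, n, y in product(PURINES, BASES, PYRIMIDINES)}


def is_comma_free(code):
    """A trinucleotide code is comma-free if, for every ordered pair of its
    codons, neither of the two shifted-frame trinucleotides spanning the
    junction belongs to the code."""
    for a, b in product(code, repeat=2):
        pair = a + b
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


def is_self_complementary(code):
    """A code is self-complementary if the reverse complement of each of
    its codons is also in the code."""
    return all(c.translate(COMPLEMENT)[::-1] in code for c in code)


if __name__ == "__main__":
    rny = rny_codons()
    print(len(rny), is_comma_free(rny), is_self_complementary(rny))
    # prints: 16 True True
```

The comma-free check explains the frameshift behavior: a shift of one position yields a trinucleotide of the form NYR, and a shift of two yields YRN, and neither can match the RNY pattern, so a shifted frame never reads a codon of the code.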