Abstract

In the 1950s, Crick proposed comma-free codes as an answer to the frame-shift problem that biologists encountered when studying the translation of a sequence of nucleotide bases into a protein. It later turned out that this proposal does not correspond to biological reality. However, in the mid-1990s, a weaker version of comma-free codes, so-called circular codes, was discovered in nature (J Theor Biol 182:45–58, 1996). Circular codes allow the reading frame to be retrieved during translation in the ribosome and, surprisingly, the circular code discovered in nature is circular in all three possible reading frames (the C^3-property). Moreover, it is maximal in the sense that it contains 20 codons, and it is self-complementary, meaning that it consists of pairs of codons and their corresponding anticodons. Further investigations showed that exactly 216 codes have the same strong properties as the code originally found in J Theor Biol 182:45–58. Using an algebraic approach, it was shown in J Math Biol, 2004 that the class of 216 maximal self-complementary C^3-codes can be partitioned into 27 equally sized equivalence classes by the action of a transformation group L ⊆ S_4 which is isomorphic to the dihedral group. Here, we extend these findings to circular codes over a finite alphabet of even cardinality |Σ| = 2n for n ∈ ℕ. We describe the corresponding group L_n using matrices and investigate which classes of circular codes are split into equally sized equivalence classes under the natural equivalence relation induced by L_n. Surprisingly, this is not always the case. All results and constructions are illustrated by examples.
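The size of the group L_n can be explored with a small brute-force sketch. It rests on an assumption that the abstract does not state explicitly: that L_n consists of exactly those permutations of the alphabet that commute with the complement map c. The letter encoding (integers 0..2n-1, with c swapping 2i and 2i+1) and the function names are hypothetical and chosen only for illustration; for n = 2 one may read 0, 1, 2, 3 as A, T, C, G.

```python
from itertools import permutations
from math import factorial


def complement(x):
    """Hypothetical complement map c: swaps the letters 2i and 2i+1."""
    return x ^ 1


def group_L(n):
    """Permutations of {0, ..., 2n-1} commuting with c (assumed description of L_n)."""
    letters = range(2 * n)
    group = []
    for images in permutations(letters):
        # images[x] is the image of letter x; keep the permutation if it commutes with c
        if all(images[complement(x)] == complement(images[x]) for x in letters):
            group.append(images)
    return group


if __name__ == "__main__":
    for n in (1, 2, 3):
        # Compare the brute-force count with n! * 2^n
        print(n, len(group_L(n)), factorial(n) * 2 ** n)
```

Under this assumption the count for n = 2 is 8, which matches the factor 216 / 27 = 8 between the number of codes and the number of equivalence classes quoted above. The enumeration is factorial in the alphabet size, so it is only practical for very small n.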

Highlights

  • Crick et al. (1957) proposed a class of trinucleotide codes, called comma-free codes, as nature’s key to avoiding errors when translating the genetic code

  • We prove that dinucleotide circular codes are divided into equally sized equivalence classes by the action of L_n, and that the same holds for the general class of -maximum circular C^l-codes over general alphabets

  • Classes of l-letter codes over general alphabets Σ have been investigated with respect to their behaviour under the natural action of a specific subgroup L of the symmetric group S_Σ acting on the letters of the alphabet

Summary

Introduction

Crick et al. (1957) proposed a class of trinucleotide codes, called comma-free codes, as nature’s key to avoiding errors when translating the genetic code. The following theorem gets to the bottom of the problem: for any word length and any alphabet cardinality, the classes of maximal (self-complementary) strong comma-free codes contain L_n-induced equivalence classes that are strictly smaller than the order of L_n. The result is as general as possible, since it applies to l-letter words for any l ≥ 1. In Lemma 4.2, it is shown that the number of maximal c-self-complementary strong comma-free diletter codes over Σ is equal to 2^n. For n ≥ 2 this number is not divisible by the order of the group, |L_n| = n!·2^n, so there must be an equivalence class of size strictly smaller than |L_n|. Again we fix an element of L_n and recall that it commutes with c. Proof: To show that L_n induces an equivalence class of size smaller than |L_n| when acting on the mentioned class C of codes, we show that |L_n| is not a divisor of the number of such codes.
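The divisibility argument can be checked numerically for small n. The sketch below is not the paper's proof and relies on two assumptions that are not taken from the text: that L_n is the set of letter permutations commuting with c, and that the maximal c-self-complementary strong comma-free diletter codes are exactly the codes A × B in which every complementary pair of letters contributes one letter to A (first positions) and the other to B (second positions). The integer encoding of letters and all function names are hypothetical.

```python
from itertools import permutations, product


def group_L(n):
    """Permutations of {0, ..., 2n-1} commuting with c(x) = x ^ 1 (assumed L_n)."""
    letters = range(2 * n)
    return [p for p in permutations(letters)
            if all(p[x ^ 1] == p[x] ^ 1 for x in letters)]


def candidate_codes(n):
    """Codes A x B with one letter of each pair {2i, 2i+1} in A and its complement in B.

    Assumed characterisation of the class of codes discussed above.
    """
    codes = []
    for bits in product((0, 1), repeat=n):
        first = [2 * i + b for i, b in enumerate(bits)]   # letters allowed in position 1
        second = [x ^ 1 for x in first]                   # their complements, position 2
        codes.append(frozenset((a, b) for a in first for b in second))
    return codes


def orbit_sizes(codes, group):
    """Sizes of the equivalence classes of `codes` under letter-wise action of `group`."""
    remaining = set(codes)
    sizes = []
    while remaining:
        code = next(iter(remaining))
        orbit = {frozenset((p[a], p[b]) for a, b in code) for p in group}
        sizes.append(len(orbit))
        remaining -= orbit
    return sorted(sizes)


if __name__ == "__main__":
    for n in (2, 3):
        group = group_L(n)
        codes = candidate_codes(n)
        # Number of codes, group order, and the sizes of the equivalence classes
        print(n, len(codes), len(group), orbit_sizes(codes, group))
```

Under these assumptions the number of codes is 2^n while the group order is n!·2^n, so no equivalence class can reach the full size |L_n| once n ≥ 2.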

The number of maximal comma-free diletter codes over Σ is
Conclusions