Abstract

Circular codes are putative remnants of primeval comma-free codes and are potentially involved in detecting and maintaining the normal reading frame in protein-coding sequences. Michel and Pirillo (2013a) showed by computer search that no maximal trinucleotide circular code can encode more than 18 different amino acids under the standard version of the genetic code. For comma-free codes the maximum is even lower, namely 13 (Michel, 2014). The main purpose of this paper is to investigate these facts from a mathematical point of view and to show why the codes with the best-known error-detecting properties are limited in the number of amino acids they can encode. We introduce five hierarchically ordered classes of trinucleotide codes, including the well-known comma-free and circular codes, and prove combinatorially that it is impossible to encode all amino acids using codes from the four of the five classes that have the strongest error-detecting properties. However, it is possible to encode all 20 amino acids using codes from the largest class, which has the weakest properties. Additionally, we develop a handy criterion for circularity. As an application, we show that all codes from a special class of trinucleotide codes, which includes the RNY-primeval code (Shepherd, 1986), are automatically circular. We also list which amino acids these codes encode.
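
As a concrete illustration of the objects discussed above, the following minimal sketch builds the RNY code (R = A or G, N = any base, Y = C or T), checks it against the textbook definition of comma-freeness by brute force, and lists the amino acids it encodes under the standard genetic code. This is not the circularity criterion developed in the paper; the function names `is_comma_free` and `encoded_amino_acids` are hypothetical and chosen only for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_comma_free(code):
    """Brute-force check: for every pair of codewords x, y, neither
    shifted frame of the concatenation xy may itself be a codeword."""
    code = set(code)
    for x, y in product(code, repeat=2):
        w = x + y
        if w[1:4] in code or w[2:5] in code:
            return False
    return True


def encoded_amino_acids(code):
    """Set of amino acids (one-letter symbols) encoded by the code,
    ignoring stop codons."""
    return {CODON_TABLE[c] for c in code} - {"*"}


# The RNY code: purine, any base, pyrimidine (DNA alphabet).
RNY = {r + n + y for r in "AG" for n in "ACGT" for y in "CT"}

print(len(RNY), is_comma_free(RNY), sorted(encoded_amino_acids(RNY)))
```

Under these assumptions the RNY code has 16 trinucleotides, passes the comma-free check (and is therefore circular, since every comma-free code is circular), and encodes 8 amino acids. A code such as `{"ACG", "CGA"}` fails the check, because `CGA` appears in a shifted frame of `ACGACG`.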
