Abstract
Nucleotide sequences have many properties of a language. Developed to its fullest, this analogy yields an interesting linguistic description of nucleotide sequences. Several a priori features of this language (called “Gnomic”) are discussed, based on its molecular nature. Gnomic appears to be a multicode language, with overlapping degenerate messages, each encoding physically different specific interactions (protein-DNA, protein-RNA, protein-protein, and RNA-RNA). Several codes of the Gnomic language are discussed: the RNA-to-protein translation (triplet) code; the chromatin code, responsible for DNA folding in chromatin; the framing code, which secures correct triplet counting during translation; and, tentatively, the RNA loop code, presumably responsible for the formation of RNA loops with specific recognition sequences. The last code is aperiodic and involves mirror-symmetrical sequence elements, while the other codes are based on sequence periodicities. A general technique for detecting words in continuous (no blanks) texts is also discussed, based on the context contrast of internally correlated strings.