Abstract

The distribution of nucleotides in protein coding genes is studied with autocorrelation functions. The autocorrelation function YRY(N)iYRY analyses the occurrence probability of the i-motif YRY(N)iYRY (two YRY motifs separated by any i bases N, where R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, and N = R or Y). In the protein coding genes of eukaryotes, prokaryotes and viruses, it reveals the classical periodicity 0 modulo 3 associated with the normal frame 0 (maximal values of the function at i = 0, 3, 6, etc.). The specification of YRY(N)iYRY on the alphabet {A, C, G, T} leads to 64 i-motifs: CAC(N)iCAC, CAC(N)iCAT, ..., TGT(N)iTGT. The 64 autocorrelation functions associated with these 64 i-motifs in protein coding genes all have a periodicity modulo 3 but, surprisingly, not always the expected periodicity 0 modulo 3. Two new types of periodicity are identified: a periodicity 1 modulo 3 associated with the shifted frame +1 (maximal values of the function at i = 1, 4, 7, etc.) and a periodicity 2 modulo 3 associated with the shifted frame -1 (maximal values of the function at i = 2, 5, 8, etc.). Furthermore, the classification of i-motifs according to the type of periodicity demonstrates a strong coherence relation between the 64 i-motifs. This relation is common to the three gene populations: the same i-motifs have the same periodicities in all three. The three periodicities 0, 1 and 2 modulo 3 can be simulated by an evolutionary model with two successive processes. The simulated genes are generated by a process of gene construction with a stochastic automaton, followed by a process of gene evolution with random insertions and deletions of trinucleotides simulating RNA editing. For almost all i-motifs, the autocorrelation functions in these simulated genes are strongly correlated with those in protein coding genes, for both the type and the probability level of the periodicities.
This paper describes the process of ribosomal frameshifting leading to the shifted periodicities, which may reveal overlapping genes or concatenated genes from different frames. It also presents the evolutionary aspects of the shifted periodicities. The shifted periodicities cannot be associated with the RNY model (Eigen & Schuster, 1978, Naturwissenschaften 65, 341-369) or the RRY model (Crick et al., 1976, Origins of Life 7, 389-397), but are compatible with the oligonucleotide mixing model (Arquès & Michel, 1990, Bull. math. Biol. 52, 741-772). Finally, a variant of the primitive translation model of Crick et al. (1976) is proposed to explain the shifted periodicities.
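
As an illustration of the autocorrelation function defined in the abstract, the following minimal Python sketch estimates the occurrence probability of an i-motif such as YRY(N)iYRY or CAC(N)iTGT along a single sequence, for i = 0, 1, 2, ..., and reports which residue of i modulo 3 carries the largest mean value. The function names (`to_ry`, `autocorrelation`, `dominant_period_class`), the normalisation by the number of admissible positions, and the use of mean values to pick the dominant residue are all assumptions made for this example; the paper's exact statistical procedure is not given in the abstract.

```python
def to_ry(seq):
    """Map A/G to R (purine) and C/T to Y (pyrimidine); anything else to '?'."""
    return "".join(
        "R" if b in "AG" else "Y" if b in "CT" else "?" for b in seq.upper()
    )


def autocorrelation(seq, motif1, motif2, max_i):
    """Return [p(0), ..., p(max_i)].

    p(i) is the fraction of start positions at which motif1 occurs and is
    followed, after exactly i arbitrary bases, by motif2.
    Assumption: normalisation by the number of admissible start positions.
    """
    n1, n2 = len(motif1), len(motif2)
    probs = []
    for i in range(max_i + 1):
        span = n1 + i + n2                 # total length of the i-motif
        positions = len(seq) - span + 1    # admissible start positions
        if positions <= 0:
            probs.append(0.0)
            continue
        hits = sum(
            1
            for k in range(positions)
            if seq[k:k + n1] == motif1 and seq[k + n1 + i:k + span] == motif2
        )
        probs.append(hits / positions)
    return probs


def dominant_period_class(probs):
    """Return r in {0, 1, 2} whose mean p(i) over i = r (mod 3) is largest.

    Assumption: comparing means is a simplification chosen for this sketch.
    """
    means = []
    for r in range(3):
        vals = probs[r::3]
        means.append(sum(vals) / len(vals) if vals else 0.0)
    return max(range(3), key=lambda r: means[r])


if __name__ == "__main__":
    toy_gene = "CAC" * 20                  # toy sequence, not real gene data
    p = autocorrelation(to_ry(toy_gene), "YRY", "YRY", 8)
    print(p, dominant_period_class(p))
```

The same `autocorrelation` function works on the {A, C, G, T} alphabet by passing the raw sequence and two specific trinucleotides, for example `autocorrelation(gene, "CAC", "TGT", 99)`; for a gene population, one would presumably average the per-gene values.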
