Abstract

The mutation process is a classical evolutionary genetic process. The mutations studied here are transversions: random substitutions of a purine base R (adenine or guanine) by a pyrimidine base Y (cytosine or thymine), and reciprocally. The analytical expressions derived allow us to analyze, in genes, the occurrence probabilities of motifs and d-motifs (two motifs separated by any d bases) on the R/Y alphabet under transversions. These motif probabilities can be obtained after transversions (in the evolutionary sense, from the past to the present) and, unexpectedly, also before transversions (after back transversions, in the inverse evolutionary sense, from the present to the past). This theoretical part, in Section 2, is a first generalization of a particular formula recently derived. The application in Section 3 is based on the analytical expression giving the autocorrelation function (the d-motif probabilities) before transversions. It allows us to study primitive genes from present-day genes. This approach solves a biological problem. The protein coding genes of chloroplasts and mitochondria have a preferential occurrence of the 6-motif YRY(N)6YRY (maximum of the autocorrelation function for d = 6, N = R or Y) with a periodicity modulo 3. The YRY(N)6YRY preferential occurrence, without the periodicity modulo 3, is also observed in the RNA coding genes (ribosomal, transfer, and small nuclear RNA genes) and in the noncoding genes (introns and 5' regions of eukaryotic nuclei). However, there are two exceptions to this YRY(N)6YRY rule: the protein coding genes of eukaryotic nuclei and those of prokaryotes, where YRY(N)6YRY has the second highest value after YRY(N)0YRY (YRYYRY), with a periodicity modulo 3. When we go backward in time with the analytical expression, the protein coding genes of both eukaryotic nuclei and prokaryotes retrieve the YRY(N)6YRY preferential occurrence with a periodicity modulo 3 after 0.2 back transversions per base.
In other words, the present-day protein coding genes of chloroplasts and mitochondria are similar to the primitive protein coding genes of eukaryotic nuclei and prokaryotes. This application is also the first result concerning the mutation process in the model of DNA sequence evolution we recently proposed. According to this model, the present-day genes on the R/Y alphabet derive from two successive evolutionary genetic processes: an independent mixing of a few nonrandom types of oligonucleotides, leading to genes called primitive, followed by a mutation process in these primitive genes. Indeed, the mutation process can simulate statistical properties identified in genes, e.g., the variations between YRY(N)0YRY and YRY(N)6YRY, which so far could not be simulated with the mixing process.
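
To make the quantities named in the abstract concrete, the following minimal Python sketch shows one way the empirical autocorrelation function of YRY on the R/Y alphabet might be estimated from a sequence, together with a forward simulation of random transversions. It is an illustration under stated assumptions, not the analytical expressions of the paper: the function names (`to_ry`, `d_motif_frequency`, `autocorrelation`, `apply_transversions`) are hypothetical, the estimator is a simple sliding-window frequency, and each base is assumed to undergo a transversion independently with probability `p`. The back-transversion formula itself is not given in the abstract and is not reproduced here.

```python
import random

PURINES = {"A", "G"}            # R
PYRIMIDINES = {"C", "T", "U"}   # Y (U accepted for RNA input; an assumption)


def to_ry(seq):
    """Translate a nucleotide sequence to the two-letter R/Y alphabet."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(out)


def d_motif_frequency(ry, d, motif="YRY"):
    """Empirical frequency of motif(N)d motif over all sliding windows."""
    k = len(motif)
    span = 2 * k + d                 # total length of the d-motif
    n_windows = len(ry) - span + 1
    if n_windows <= 0:
        return 0.0
    hits = sum(
        1
        for i in range(n_windows)
        if ry[i:i + k] == motif and ry[i + k + d:i + span] == motif
    )
    return hits / n_windows


def autocorrelation(ry, max_d):
    """Frequencies of YRY(N)dYRY for d = 0 .. max_d."""
    return [d_motif_frequency(ry, d) for d in range(max_d + 1)]


def apply_transversions(ry, p, rng):
    """Forward simulation: swap R and Y independently with probability p."""
    swap = {"R": "Y", "Y": "R"}
    return "".join(swap[c] if rng.random() < p else c for c in ry)
```

Under these assumptions, comparing `autocorrelation(ry, 12)` before and after `apply_transversions(ry, p, random.Random(seed))` gives a rough numerical picture of how the relative heights at d = 0 and d = 6 shift as transversions accumulate in the forward direction.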
