Abstract

Modern genome sequences are the products of gene evolution. To understand gene evolution, it is necessary to investigate universal features of the base sequences of modern genes. We have investigated the characteristics of these base sequences using statistical analysis. In our previous work, we showed that the base sequences of modern genes universally contain statistically significant repetitive short tuples [1]. These significant repetitive tuples (SRTs) exist not only in coding regions but also in noncoding regions, and the lengths of SRTs are not multiples of the codon length. Each gene has a specific set of SRT base sequences across the whole gene, which consists of both coding and noncoding regions [2]. A set of genes encoding proteins with different functions shares few common SRTs, whereas a gene family shares a number of common SRTs. It is possible that the base sequences of SRTs are influenced by the codons used frequently in the coding region. If the base sequences of the repetitive short tuples are related to the sequences of codons, our results would reflect codon usage preference. We have not yet studied the actual locations of SRTs in the reading frame. Instead, in this work, we analysed the distances between completely matched tuples to investigate whether the locations of tuples with the same base sequence in the coding region are related to the reading frame.
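The abstract does not specify how the distances between matched tuples were computed. As a minimal sketch of one way such an analysis could be set up, the Python below collects the distances between consecutive occurrences of each identical tuple of length `k` and tallies them by their remainder modulo the codon length. The function names `matched_tuple_distances` and `frame_residues`, the use of consecutive occurrences only, and the modulo-3 tally are assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def matched_tuple_distances(seq, k):
    """Distances between consecutive occurrences of each identical k-tuple.

    Hypothetical helper: only exact (completely matched) tuples are paired,
    and only neighbouring occurrences of the same tuple are compared.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    distances = []
    for pos in positions.values():
        # pos is already in ascending order because we scanned left to right
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances


def frame_residues(distances, codon_length=3):
    """Count distances by their remainder modulo the codon length.

    A remainder of 0 means the two matched tuples start at the same
    position within a codon, i.e. they share the same reading-frame phase.
    """
    return Counter(d % codon_length for d in distances)


if __name__ == "__main__":
    toy = "ATGATGATG"  # toy sequence, not real data
    d = matched_tuple_distances(toy, k=3)
    print(sorted(d))
    print(frame_residues(d))
```

Under these assumptions, an excess of remainder-0 distances in a coding region, compared with a shuffled or noncoding control, would suggest that identical tuples tend to recur in the same reading-frame phase.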
