Abstract

The Chaos Game is an algorithm that produces pictures of fractal structures. Considering that the four bases A, G, C, and T of DNA sequences can be classified in three ways according to their chemical structure, we propose different kinds of CGR-walk sequences. Based on the CGR coordinates of random sequences, we introduce some invariants for DNA primary sequences. As an application, we examine the similarity/dissimilarity among the first exons of the β-globin gene of different species. The results indicate that our method is efficient and can extract more biological information.
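To make the Chaos Game Representation concrete, the following is a minimal sketch of how CGR coordinates can be computed for a DNA string. The corner assignment (A, C, G, T on the corners of the unit square), the starting point at the centre, and the function name `cgr_points` are assumptions chosen for illustration; the abstract does not state which convention the authors use.

```python
# Assumed corner convention for the unit square; not taken from the paper.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the list of CGR points for a DNA sequence.

    Each point is the midpoint between the previous point and the
    corner that belongs to the current base.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

Plotting the returned points for a long sequence gives the usual CGR picture; the x and y coordinate lists can also be treated as numerical series for further analysis.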

Highlights

  • A DNA sequence is comprised of four different nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T)

  • To numerically characterize a DNA sequence given by the Chaos Game Representation (CGR), we treat the Hurst exponent as an efficient invariant that is sensitive to this kind of graphical representation

  • DNA sequences play an important role in modern biological research because all the information on heredity and species evolution is contained in these macromolecules
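The Hurst exponent mentioned in the highlights can be estimated in several ways. The sketch below uses classical rescaled-range (R/S) analysis, which is only one common estimator; the summary does not say which estimator the authors use, and the function names `rescaled_range` and `hurst_rs` as well as the window-doubling scheme are assumptions made for illustration.

```python
import math


def rescaled_range(chunk):
    """Return R/S for one window, or None if the window is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = 0.0
    lo = hi = 0.0
    for value in chunk:
        cum += value - mean          # cumulative deviation from the mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return (hi - lo) / std if std > 0 else None


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the slope of log(R/S) against log(window)."""
    n = len(series)
    log_sizes, log_rs = [], []
    window = min_window
    while window <= n // 2:
        values = []
        # Split the series into non-overlapping windows of this size.
        for start in range(0, n - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2
    if len(log_sizes) < 2:
        raise ValueError("series too short for R/S analysis")
    # Ordinary least-squares slope of log(R/S) on log(window size).
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den
```

An estimate near 0.5 suggests an uncorrelated series, while values above 0.5 suggest long-range persistence. R/S estimates are known to be biased for short series, so values for short exons should be read with caution.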


Summary

Introduction

A DNA sequence is comprised of four different nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T). Zhang [15] considered a representation of the DNA primary sequence termed the Z-curve. In recent studies, several researchers have outlined different kinds of graphical representation of DNA sequences based on 2D [16,17,18,19,20,21], 3D [22,23,24,25], 4D [26], 5D [27], and 6D [28] spaces. Gao and Xu [29] pointed out that the CGR-walk model can generate a model sequence that is fitted reasonably well by a long-memory ARFIMA(p, d, q) model. They treated the four bases and ignored the hidden chemical classification of nucleotides. In this article (Abstract and Applied Analysis), we compare the similarity and dissimilarity of the first exon of β-globin gene sequences derived from nine species.
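The chemical classification of nucleotides referred to above can be illustrated with the three standard two-class groupings of the bases: purine/pyrimidine, amino/keto, and weak/strong hydrogen bonding. The sketch below only reduces a sequence to each two-letter alphabet; how the paper builds its CGR-walk sequences from these groupings is not given in this summary, so the dictionary `CLASSIFICATIONS` and the function `reduce_sequence` are hypothetical names used for illustration.

```python
# Standard IUPAC-style class letters for the three groupings of the bases.
CLASSIFICATIONS = {
    "purine_pyrimidine": {"A": "R", "G": "R", "C": "Y", "T": "Y"},
    "amino_keto": {"A": "M", "C": "M", "G": "K", "T": "K"},
    "weak_strong": {"A": "W", "T": "W", "C": "S", "G": "S"},
}


def reduce_sequence(sequence, scheme):
    """Map each base to its class letter under the chosen scheme.

    Symbols other than A, C, G, T are skipped.
    """
    mapping = CLASSIFICATIONS[scheme]
    return "".join(mapping[b] for b in sequence.upper() if b in mapping)


if __name__ == "__main__":
    for scheme in CLASSIFICATIONS:
        print(scheme, reduce_sequence("ATGGTGCAC", scheme))
```

Each reduced sequence is a binary string, so the same DNA sequence yields three different views that can be analysed separately.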

CGR-Walk Based on Three Kinds of Classification and Primary Sequences
Numerical Characterization of DNA Sequences
Conclusion