Abstract

We study the size distribution of purine (Pu) and pyrimidine (Py) clusters in coding and non-coding DNA sequences. The cluster-size distribution P(s) decays exponentially in coding sequences, whereas in non-coding sequences it follows a power law, P(s) ∼ s^(−1−μ), with exponent μ = 1.5–1.8. The mean-square displacement σ²(m) is examined via a cluster walk model whose step sizes follow P(s), with m denoting the number of clusters covered by the walker. The mean-square displacement behaves as σ²(m) ∼ m^(2/μ) for non-coding sequences and as σ²(m) ∼ m for coding sequences. We associate the power-law behaviour in non-coding sequences with the tendency to form large Pu and Py clusters, which dominate the non-coding regions. Under this observation, the entire DNA sequence may be regarded as a collection of extended non-coding regions interrupted by small coding regions. We recall that this irregular composition of DNA is of vital importance for living organisms: transposable elements and other “parasite” DNA that try to incorporate themselves into the DNA chain most probably intersect the large non-coding regions, thus leaving the organism unaffected, as is well known to biologists.
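
As an illustration of the quantities named above, the following minimal Python sketch extracts Pu/Py clusters from a sequence, estimates P(s), and builds a cluster walk. It is not the authors' code. All function names are hypothetical, and the sign convention of the walk (a step of +s for a purine cluster and −s for a pyrimidine cluster) and the sliding-window estimate of σ²(m) are assumptions made for illustration only; the abstract does not specify them.

```python
from collections import Counter
from itertools import groupby

PURINES = frozenset("AG")  # A and G are purines; C and T are pyrimidines


def cluster_sizes(seq):
    """Return maximal runs of purines or pyrimidines as (kind, size) pairs."""
    # Characters other than A, C, G, T (e.g. N) are skipped: an assumption.
    labels = ["Pu" if b in PURINES else "Py"
              for b in seq.upper() if b in "ACGT"]
    return [(kind, sum(1 for _ in run)) for kind, run in groupby(labels)]


def size_distribution(clusters):
    """Empirical P(s): fraction of clusters that have size s."""
    counts = Counter(size for _, size in clusters)
    total = sum(counts.values())
    return {s: c / total for s, c in sorted(counts.items())}


def cluster_walk(clusters):
    """Walker position after each cluster.

    Assumed convention (not taken from the source): a purine cluster of
    size s moves the walker by +s, a pyrimidine cluster by -s.
    """
    pos, path = 0, [0]
    for kind, size in clusters:
        pos += size if kind == "Pu" else -size
        path.append(pos)
    return path


def mean_square_displacement(path, m):
    """Variance of the displacement over all windows of m clusters."""
    disp = [path[i + m] - path[i] for i in range(len(path) - m)]
    mean = sum(disp) / len(disp)
    return sum((d - mean) ** 2 for d in disp) / len(disp)


clusters = cluster_sizes("AAGCTTTAG")
path = cluster_walk(clusters)
```

For the toy sequence `AAGCTTTAG`, the clusters are AAG (Pu, size 3), CTTT (Py, size 4) and AG (Pu, size 2). On a real sequence, one would plot `size_distribution` on semi-log and log-log axes to distinguish exponential from power-law decay, and fit the growth of `mean_square_displacement` against m.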
