Abstract

Repeats or Transposable Elements (TEs) are highly repeated sequence stretches, present in virtually all eukaryotic genomes. We explore the distribution of representative TEs from all major classes in entire chromosomes across various organisms. We employ two complementary approaches, the scaling of block entropy and box-counting. Both converge to the conclusion that well-developed fractality is typical of small genomes, while in large genomes it appears sporadically and in some cases is rudimentary. The human genome is particularly prone to develop this pattern, as TE chromosomal distributions therein are often highly clustered and inhomogeneous. Compared with previous work, which studied the occurrence of power-law-like size distributions of inter-repeat distances, we conclude that fractality in entire chromosomes is a more stringent (thus less often encountered) condition. We have formulated a simple evolutionary scenario for the genomic dynamics of TEs, which may account for their fractal distribution in real genomes. The observed fractality and long-range properties of TE genomic distributions have probably contributed to the formation of the “fractal globule”, a model for the confined chromatin organization of the eukaryotic nucleus proposed on the basis of experimental evidence.

Highlights

  • In information theory, the notion of entropy was conceived by Shannon [1] to estimate the amount of information that is carried in a transmitted message

  • As we discuss later on, this property is in accordance with the model we propose for the generation of fractality and long-rangeness in genomes

  • As implemented in our study, entropic values are computed with windows of variable block length, and fractality and self-similarity are estimated from the whole time series


Summary

Introduction

The notion of entropy was conceived by Shannon [1] to estimate the amount of information carried in a transmitted message. Scale invariance and fractality have been found in time series from signal transmission in electronic engineering, earthquakes, economics, the social sciences and many other fields. Very often, such studies have been carried out using the standard box-counting technique, and in several cases of systems characterized by long-range correlations Shannon entropy has been used. Several studies have shown that a linear scaling of the Shannon-like (or block) entropy H(n) with the length n of the word (hereafter called an n-word) in semi-logarithmic plots is a clear indication of long-range order and fractality, as we are going to discuss [3,4,5,6]. We verified this conjecture numerically in the case of finite Cantor-like symbol sequences [2]. We showed that the genomic distribution of protein coding segments often exhibits this particular scaling.
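As a minimal illustration of the two measures named above, the sketch below builds a finite Cantor-like binary sequence and computes its block entropy H(n) and a box count. It is not the authors' procedure. The substitution rule (1 → 101, 0 → 000), the sliding-window reading of n-words, and the function names `cantor_sequence`, `block_entropy` and `box_count` are all assumptions made for the example.

```python
import math
from collections import Counter


def cantor_sequence(level):
    """Finite Cantor-like binary sequence (assumed rule: 1 -> 101, 0 -> 000)."""
    seq = "1"
    for _ in range(level):
        seq = "".join("101" if s == "1" else "000" for s in seq)
    return seq


def block_entropy(seq, n):
    """Shannon entropy, in bits, of the n-word distribution of seq.

    Words are read with a sliding window of length n (an assumption;
    reading non-overlapping words is another common convention).
    """
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def box_count(seq, box_size):
    """Number of consecutive boxes of length box_size containing at least one '1'."""
    return sum("1" in seq[i:i + box_size] for i in range(0, len(seq), box_size))


if __name__ == "__main__":
    seq = cantor_sequence(8)  # 3**8 = 6561 symbols
    # H(n) against log(n): an approximately straight line in this
    # semi-logarithmic view is the scaling discussed in the text.
    for n in (1, 2, 4, 8, 16, 32):
        print(n, round(math.log(n), 3), round(block_entropy(seq, n), 3))
    # Box counts N(eps) for box sizes eps = 3**k.
    for k in range(0, 8):
        print(3 ** k, box_count(seq, 3 ** k))
```

For the box count, the slope of log N(ε) against log(1/ε) estimates the box-counting dimension. On this toy sequence, each box of size 3^k either holds a smaller copy of the sequence or is empty, so the occupied-box count halves each time the box size triples.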

