Abstract

Biologists have long sought a way to explain how statistical properties of genetic sequences emerged and are maintained through evolution. On the one hand, non-random structures at different scales indicate a complex genome organisation. On the other hand, single-strand symmetry has been scrutinised using neutral models in which correlations are either not considered or irrelevant, contrary to empirical evidence. Previous studies investigated these two statistical features separately and reached minimal consensus despite sustained efforts. Here we unravel previously unknown symmetries in genetic sequences, which are organised hierarchically through the scales at which non-random structures are known to be present. These observations are confirmed through the statistical analysis of the human genome and explained through a simple domain model. These results suggest that domain models which account for the cumulative action of mobile elements can simultaneously explain non-random structures and symmetries in genetic sequences.

Highlights

  • Our main empirical findings are: (i) the Chargaff parity rule extends beyond the frequencies of short oligonucleotides; and (ii) the Chargaff symmetry is not the only symmetry present in genetic sequences as a whole: there exists a hierarchy of symmetries nested at different structural scales

  • From a historical point of view, the symmetry was one of the key ingredients leading to the double-helix solution of the complicated genetic structure puzzle, demonstrating the fruitfulness of a unified study of symmetry and structure in genetic sequences

Summary

Introduction

Biologists have long sought a way to explain how statistical properties of genetic sequences emerged and are maintained through evolution. We unravel previously unknown symmetries in genetic sequences, which are organised hierarchically through the scales at which non-random structures are known to be present. These observations are confirmed through the statistical analysis of the human genome and explained through a simple domain model. While the mechanisms responsible for these observations have been intensively debated[4,5,6,7,8,9], several investigations indicate that the patchiness and mosaic-type domains of DNA play a key role in the existence of large-scale structures[4,10,11]. Another well-established statistical observation is the symmetry known as the “Second Chargaff Parity Rule”[12], which appears universally over almost all extant genomes[13,14,15]. Domain models have been used to explain structures (e.g., the patchiness and long-range correlations in DNA). The significance of our results is that they indicate that the same biological processes leading to domains can explain the origin of the symmetries observed in DNA sequences.
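
The Second Chargaff Parity Rule mentioned above says that, on a single strand, an oligonucleotide and its reverse complement occur with approximately equal frequency. The following is a minimal sketch of how one might check this on a sequence. The function names and the asymmetry score are illustrative assumptions and are not the statistic used in the paper.

```python
from collections import Counter

# Watson-Crick complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word, e.g. AACG -> CGTT."""
    return word.translate(COMPLEMENT)[::-1]


def kmer_counts(sequence, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def chargaff_asymmetry(sequence, k):
    """Illustrative score in [0, 1]: 0 means every k-mer is exactly as
    frequent as its reverse complement; 1 means no reverse complement
    of an observed k-mer occurs at all (hypothetical measure)."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    # Each (word, reverse complement) pair is visited twice, hence 2 * total.
    return diff / (2 * total)
```

For a real genome the score would be expected to be small but non-zero for short words. Whether and how the symmetry persists at larger scales is the question the paper addresses, and this sketch does not attempt to reproduce that analysis.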
