Abstract
Multicellular eukaryotic genomes are replete with non-protein-coding sequences, both within genes (introns) and between them (intergenic regions). Excluding the well-recognized functional elements within these sequences (ncRNAs, transcription factor binding sites, intronic enhancers/silencers, etc.), the remaining portion is made up of so-called “dark” DNA, which still occupies the majority of the genome. This dark DNA shows profound nonrandomness in its sequence composition at different scales, from a few nucleotides to regions that span hundreds of thousands of nucleotides. At the mid-range scale (from 30 up to 10,000 nt), this nonrandomness is manifested in base compositional extremes detected for each of the four nucleotides (A, G, T, or C) or any of their combinations. Examples of such compositional nonrandomness are A-rich, purine-rich, and G+T-rich regions. Almost every combination of nucleotides has such enriched regions. We refer to these regions as being “inhomogeneous”. These regions are associated with unusual DNA conformations and/or particular DNA properties. Moreover, mid-range inhomogeneous regions have complex arrangements relative to each other and to specific genomic sites, such as centromeres, telomeres, and promoters, pointing to their important role in genomic functioning and organization.