Abstract

Specific characteristics of k-mer words (2 ≤ k ≤ 11) with respect to genomic distribution and evolutionary conservation were recently described. Highly abundant among these words are those with a tandem repeat structure (repeat unit length of 1 bp to 3 bp). Furthermore, there appears to be a class of extremely short tandem repeats (≤12 bp), so far overlooked, that are non-randomly distributed and may therefore play a crucial role in the functioning of the genome. In this article, we compare the positional distributions of these motifs, which we call super-short tandem repeats (SSTRs), with those of other functional elements such as genes and retrotransposons. We found length- and sequence-dependent correlations between the local SSTR density and G+C content, between the density of SSTRs and that of genes, and between SSTR density and retrotransposon density. Among these relations, the SINE Alu stands out as having a strong influence on the local SSTR density. Moreover, the observed connection of SSTR patterns to pseudogenes and pseudoexons might imply a special role of SSTRs in gene expression. In summary, our findings support the idea that SSTRs are functionally relevant in the genome.
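
To make the SSTR definition concrete, the following minimal Python sketch locates tandem repeats with a unit length of 1 bp to 3 bp and a total length of at most 12 bp, then reports the SSTR count and G+C fraction per window. It is an illustration only, not the authors' pipeline: the function names `find_sstrs` and `window_stats`, the minimum of two repeat copies, and the window size are all assumptions that the abstract does not specify.

```python
import re


def find_sstrs(seq, max_unit=3, max_len=12, min_copies=2):
    """Return sorted (start, unit, copies) tuples for short tandem repeats.

    Assumptions (not taken from the article): a repeat needs at least
    `min_copies` full copies of its unit, and runs longer than `max_len`
    are discarded rather than truncated.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # One unit of `unit_len` bases followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit (e.g. "AA", "TTT").
            if any(
                unit == unit[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            length = m.end() - m.start()
            if length <= max_len:
                hits.append((m.start(), unit, length // unit_len))
    return sorted(hits)


def window_stats(seq, window=1000):
    """Yield (window_start, sstr_count, gc_fraction) for non-overlapping windows."""
    seq = seq.upper()
    hits = find_sstrs(seq)
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        # Count repeats whose start position falls inside this window.
        count = sum(1 for pos, _, _ in hits if start <= pos < start + window)
        gc = (chunk.count("G") + chunk.count("C")) / len(chunk)
        yield start, count, gc


if __name__ == "__main__":
    for row in window_stats("GGACACACTTT", window=11):
        print(row)
```

Per-window counts from `window_stats` could then be correlated with G+C fraction or with gene and retrotransposon annotations. Those annotation sources are outside the scope of this sketch.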
