Abstract

Short tandem repeats (STRs) are variable elements that play a significant role in genome evolution by creating and maintaining quantitative genetic variation. Because of their proximity to the +1 transcription start site (TSS) and their polymorphic nature, core promoter STRs may be considered a novel source of variation across species. In a genome-scale analysis of all human protein-coding genes annotated in the GeneCards database (19,927 genes), we analyzed the prevalence and repeat numbers of different classes of core promoter STRs in the interval between −120 and +1 relative to the TSS. We also analyzed the evolutionary trend of exceptionally long core promoter STRs (≥6 repeats). In total, 133 genes (~2%) had core promoter STRs of ≥6 repeats. In the majority of those genes, the STR motifs were conserved across evolution. Di-nucleotide repeats had the highest representation among the long human core promoter STRs (72 genes), followed in descending prevalence by tri-nucleotide STRs (52 genes) and tetra-, penta-, and hexa-nucleotide STRs (9 genes). The majority of the 133 genes (84 genes) showed directional expansion of core promoter STRs from mouse to human. However, in a number of genes, the difference in average allele size across species was small enough to suggest a constraint on the evolution of average allele size. Random drift of STRs from mouse to human was also observed in a minority of genes. Future work on the genes listed in the current study may further our understanding of the potential importance of core promoter STRs in human evolution.
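
To make the ≥6-repeat criterion concrete, the following is a minimal sketch of how perfect STRs with motifs of 2 to 6 nucleotides could be detected in a core promoter sequence. The abstract does not describe the detection method actually used, so the regular-expression approach, the function names `is_primitive` and `find_long_strs`, and the example sequence are all illustrative assumptions. The sketch reports only uninterrupted repeats and gives offsets within the supplied sequence, not coordinates relative to the TSS.

```python
import re


def is_primitive(motif: str) -> bool:
    """Return True if the motif is not itself a repetition of a shorter unit."""
    # e.g. "CACA" is "CA" twice, so it is not a primitive tetra-nucleotide motif.
    return (motif + motif).find(motif, 1) == len(motif)


def find_long_strs(seq: str, min_repeats: int = 6, motif_sizes=range(2, 7)):
    """Find perfect STRs with at least `min_repeats` copies of the motif.

    Hypothetical helper: returns (motif, repeat_count, start_offset) tuples,
    where start_offset is the 0-based position within `seq`.
    """
    seq = seq.upper()
    hits = []
    for k in motif_sizes:
        # One motif of length k followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # already counted under its shorter motif class
            repeats = len(match.group(0)) // k
            hits.append((motif, repeats, match.start()))
    return hits


# Made-up example sequence containing a (CA)8 repeat.
example = "TTGACG" + "CA" * 8 + "GGTC"
print(find_long_strs(example))  # [('CA', 8, 6)]
```

The primitive-motif check keeps each repeat in a single class: a run of 12 CA units is reported once as a di-nucleotide STR and not again as a tetra-nucleotide (CACA) repeat.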
