ABSTRACT
Microsatellites, or simple sequence repeats (SSRs), are prevalent across the genomes of diverse organisms. However, their distribution patterns and evolutionary dynamics in reptile genomes have rarely been studied systematically. We herein conducted a comprehensive analysis of SSRs in the genomes of 36 reptile species. Our findings revealed that the total number of SSRs ranged from 1,840,965 to 7,664,452, accounting for 2.16%–8.19% of the genomes analyzed. The relative density ranged from 21,567.82 to 81,889.41 bp per megabase (Mbp). The abundance of different SSR categories followed the pattern of imperfect SSR (I‐SSR) > perfect SSR (P‐SSR) > compound SSR (C‐SSR). A significant positive correlation was observed between the number of SSRs and genome size (p = 0.0034), whereas SSR frequency (p = 0.013) and density (p = 0.0099) were negatively correlated with genome size. Furthermore, no correlation was found between SSR length and genome size. Mononucleotide repeats were the most common P‐SSRs in crocodilians and turtles, whereas mononucleotides, trinucleotides, or tetranucleotides were the most common P‐SSRs in snakes, lizards, and tuatara. P‐SSRs of varying motif sizes showed nonrandom distribution across different genic regions, with AT‐rich repeats being predominant. The genomic SSR content of the squamate lineage ranked the highest in abundance and variability, whereas crocodilians and turtles showed a slowly evolving and reduced microsatellite landscape. Gene ontology enrichment and Kyoto Encyclopedia of Genes and Genomes pathway analyses indicated that genes harboring P‐SSRs in the coding DNA sequence regions were primarily involved in the regulation of transcription and translation processes. The SSR dataset generated in this study provides potential candidates for functional analysis and calls for broader‐scale analyses across the evolutionary spectrum.
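To make the reported metrics concrete, the following is a minimal Python sketch of how perfect SSRs could be located in a sequence and summarized as frequency (loci per Mbp) and relative density (bp of SSR per Mbp). It is not the pipeline used in the study: the minimum repeat thresholds, the function names `find_perfect_ssrs` and `ssr_stats`, and the exact metric definitions are assumptions chosen for illustration, and imperfect and compound SSRs are not handled.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
# These thresholds are illustrative assumptions, not values from the study.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, length_bp) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp motif followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # so each locus is reported only under its shortest motif.
            if any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k)):
                continue
            hits.append((m.start(), motif, m.end() - m.start()))
    return hits


def ssr_stats(seq):
    """Summarize SSR count, frequency (loci/Mbp), and density (bp/Mbp)."""
    hits = find_perfect_ssrs(seq)
    mbp = len(seq) / 1_000_000
    return {
        "count": len(hits),
        "frequency_per_mbp": len(hits) / mbp,
        "density_bp_per_mbp": sum(length for _, _, length in hits) / mbp,
    }


# Toy sequence: one (AT)8 locus and one (A)12 locus pass the thresholds.
example = "G" * 10 + "AT" * 8 + "C" * 5 + "A" * 12 + "GCT" * 3
hits = find_perfect_ssrs(example)
stats = ssr_stats(example)
```

Under these assumed definitions, frequency and density are both normalized by genome length, which is why they can fall with genome size even while the raw SSR count rises.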