Abstract
Simple sequence repeats (SSRs) are used as polymorphic molecular markers in many species. They contribute functionally important variation to a range of complex traits; however, little is known about the variation of most SSRs in pig populations. Here, using genome resequencing data, we identified ~0.63 million polymorphic SSR loci from more than 100 individuals. Through intensive analysis of this dataset, we found that the SSR motif composition, motif length, total length of alleles and distribution of alleles all contribute to SSR variability. Furthermore, we found that CG-containing SSRs displayed significantly lower polymorphism and higher cross-species conservation. Using a rigorous filtering procedure, we provide a catalogue of 16,527 high-quality polymorphic SSRs, which displayed reliable results for the analysis of phylogenetic relationships and provided valuable summary statistics for 30 individuals equally selected from eight local Chinese pig breeds, six commercial lean pig breeds and Chinese wild boars. In addition, from the high-quality polymorphic SSR catalogue, we identified four loci with potential loss-of-function alleles. Overall, these analyses add a valuable catalogue of polymorphic SSRs to the existing pig genetic variation database, and we believe this catalogue could be used for future genome-wide genetic analysis.
Highlights
Simple sequence repeats (SSRs) are tandem repeats with core motifs of 2 to 6 base pairs, which are widely distributed in both eukaryotic and prokaryotic genomes.
We found that Chinese domestic pig breeds generally have higher heterozygous polymorphic SSR (pSSR) ratios than the western commercial lean pig breeds (Student’s t-test p value for the median = 0.002287) (Fig. 6c), a result further supported by group comparisons of the heterozygous pSSR ratio and polymorphism information content (PIC) between Chinese domestic pigs and western lean pigs (Fig. 6d,e).
These novel variants are not included in dbSNP, probably for two reasons: (1) previous variant-calling procedures filtered out low-complexity repeat regions, and (2) popular software and pipelines focused on single-nucleotide polymorphism (SNP) calling and performed poorly in SSR variant calling.
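The heterozygous pSSR ratio and PIC mentioned in the highlights can be illustrated with a small sketch. The text above does not spell out how either statistic was computed, so this assumes the standard Botstein-style PIC formula (1 minus the sum of squared allele frequencies, minus the sum over allele pairs of 2·p_i²·p_j²) and treats the heterozygous ratio as the fraction of called diploid genotypes whose two alleles differ. The function names and the representation of a genotype as a pair of repeat counts are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as (allele_a, allele_b) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def pic(freqs):
    """Polymorphism information content (assumed Botstein-style definition)."""
    p = list(freqs.values())
    sum_squares = sum(x * x for x in p)
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1 - sum_squares - cross_term


def heterozygous_ratio(genotypes):
    """Fraction of genotype calls whose two alleles differ."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical locus: alleles are repeat counts observed in four individuals.
example = [(10, 10), (10, 12), (12, 12), (10, 12)]
example_freqs = allele_frequencies(example)
example_pic = pic(example_freqs)
example_het = heterozygous_ratio(example)
```

For this made-up locus the two alleles are equally frequent, so the PIC is 0.375 and half of the genotypes are heterozygous. The same `heterozygous_ratio` function could be applied per individual across loci or per locus across individuals, depending on which comparison is wanted.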
Summary
Simple sequence repeats (SSRs) are tandem repeats with core motifs of 2 to 6 base pairs (bp), which are widely distributed in both eukaryotic and prokaryotic genomes. Because of their wide distribution, high level of polymorphism and co-dominant character, SSRs are commonly used as molecular markers for genetic mapping, population diversity and evolution studies. Various genome resequencing or de novo sequencing projects have been conducted in different pig breeds, and related data have been published and made available through GenBank[22,23,24,25,26,27,28]. With these data, regions responding to domestication, breed-specific characters and introgression were identified in a genome-wide scan based on high-throughput SNP analysis. With these polymorphic SSRs, we attempt to analyse some potential functional variations and evaluate their utility in population genetics.
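To make the definition of an SSR concrete, here is a minimal sketch that scans a DNA string for perfect tandem repeats with core motifs of 2 to 6 bp. It is not the detection method used in the study, which is not described in this summary. The minimum of three repeat units, the restriction to perfect repeats, and the function names are all assumptions chosen for illustration.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' or 'AA')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=3, min_motif=2, max_motif=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Start positions are 0-based. min_repeats is an assumed threshold.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as 'AA' or 'ATAT' that reduce to a shorter unit.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


ca_repeat = find_ssrs("GGCACACACACATT")
homopolymer = find_ssrs("AAAAAAAA")
```

On the first toy sequence the sketch reports a single CA repeat of five units starting at position 2; the homopolymer run yields no hits because mononucleotide tracts fall outside the 2 to 6 bp motif range. Calling polymorphic SSRs from resequencing reads additionally requires comparing allele lengths across individuals, which this sketch does not attempt.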