Abstract
Short tandem repeats (STRs) are often associated with genetic diseases and gene regulatory functions, and are also important genetic markers for evolutionary, genetic diversity, and forensic analyses. However, for the majority of STRs in the duck genome, their population genetic properties and functional impacts remain poorly defined. The recent advent of next generation sequencing (NGS) has offered an opportunity to profile large numbers of polymorphic STRs. Here, we report a population-scale analysis of STR variation using genome resequencing in mallard and Pekin duck. Our analysis provides the first genome-wide duck STR reference, comprising 198,022 STR loci with motif sizes of 2–6 base pairs. We observed a relatively uneven distribution of STRs across genomic regions, which indicates that the occurrence of STRs in the duck genome is not random but is subject to directional selection pressure. Using genome resequencing data from 23 mallards and 26 Pekin ducks, we identified 89,891 polymorphic STR loci. Detailed analysis of this dataset suggested that a shorter repeat motif, a longer reference tract length, higher purity, and location outside a coding region are all associated with increased STR variability. STR genotypes were used for population genetic analysis, and the results showed that population structure and divergence patterns among population groups can be efficiently captured. In addition, a comparison between Pekin duck and mallard identified 3,122 STRs with extremely divergent allele frequencies, which overlapped with a set of genes related to the nervous system, energy metabolism, and behavior. The evolutionary analysis revealed that genes containing divergent STRs may play important roles in phenotypic changes during duck domestication. This population-scale analysis of STR variation provides a valuable resource for future studies of genetic diversity and genome evolution in duck.
Highlights
Short tandem repeats (STRs), also known as simple sequence repeats (SSRs) or microsatellites, are tandemly repeated nucleotide motifs of 1–6 bp in DNA sequences
Twenty-one sequence datasets were generated in our previous work (Zhou et al, 2018) and are available at the Genome Sequence Archive (GSA), and the other 28 previously published datasets were downloaded from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA)
The AT motif was strongly overrepresented among dinucleotide motifs, and it was the most frequent motif in the entire duck genome, accounting for 12.30% of the total SSR loci discovered
Summary
Short tandem repeats (STRs), also known as simple sequence repeats (SSRs) or microsatellites, are tandemly repeated nucleotide motifs of 1–6 bp in DNA sequences. These sequences are ubiquitous in eukaryotic and prokaryotic genomes, and occur in both genic and intergenic regions (Toth et al, 2000). They are often highly variable, with mutation rates that depend on several factors, including the STR motif length, repeat number, purity, and location in the genome (Brandstrom and Ellegren, 2008; Payseur et al, 2011). Copy number variation of the "CCG" trinucleotide repeat in the promoter of Pleomorphic adenoma gene 1 (PLAG1) was identified as a potential causative mutation influencing bovine stature, acting by serving as nuclear factor binding sites and modulating the expression of PLAG1 (Karim et al, 2011).
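To make the definition of an STR given above concrete, the following is a minimal Python sketch that scans a DNA string for perfect tandem repeats with motifs of 1–6 bp. It is an illustration only and is not the detection pipeline used in the study; the function name `find_strs`, the minimum copy number `min_repeats`, and the toy sequence are all hypothetical choices made for this example.

```python
import re

def find_strs(seq, min_motif=1, max_motif=6, min_repeats=3):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    min_repeats is an illustrative threshold, not a value from the paper.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "ATAT" is reported as "AT" by the k=2 pass).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

# Toy sequence: an (AT)5 tract and a (CCG)4 tract with short flanks.
toy = "TTG" + "AT" * 5 + "GGA" + "CCG" * 4 + "TA"
for start, motif, copies in find_strs(toy):
    print(start, motif, copies)
```

The sketch only reports perfect (fully pure) repeats, so an interrupted tract would be split or missed; tract purity, which the text lists as a factor in STR variability, would need a more tolerant matcher.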