Y-chromosome short tandem repeats (Y-STRs) are an important class of genetic markers in the human genome, widely used in molecular anthropology and forensic genetics. However, most Y-STR studies have focused on length-based variation resulting from differences in the number of repeat units, and less attention has been paid to sequence-based Y-STR variation. Consequently, the sequence-based variation characteristics of Y-STRs in Chinese populations remain insufficiently studied. In this study, targeted sequencing of 42 Y-STR loci was performed for 331 Chinese Han males (average sequencing depth of 612×), revealing a total of 387 sequence allele types and their frequencies in the population. Repeat pattern variations were observed in seven loci containing multiple repeat units. Across all sequenced repeat and flanking regions, 46 single-nucleotide substitutions and insertion/deletion variations were identified, including 13 mutations not recorded in the dbSNP database. Twenty-seven previously unreported sequence-based alleles were also identified. In addition, analysis of the sequence-based data revealed differences in Y-STRs between the Chinese Han population and three American populations (African Americans, Caucasians, and Hispanics). In summary, this study provides a detailed characterization of the sequence features of 42 Y-STRs in the Chinese Han population, improving our understanding of Y-STRs and providing basic data on sequence variation for the application of Y-STRs.