Sequence polymorphisms were characterized at 27 autosomal STRs (A-STRs), 7 X-chromosomal STRs (X-STRs), and 24 Y-chromosomal STRs (Y-STRs) in 635 Northern Han Chinese individuals with the ForenSeq DNA Signature Prep Kit on the MiSeq FGx Forensic Genomics System. Because massively parallel sequencing (MPS) detects variation in both the repeat region (RR) and the flanking region (FR), the number of unique alleles and the average gene diversity were 78.18% and 3.51% higher, respectively, for sequence-based alleles than for length-based alleles. Compared with STRSeq and previous studies, a total of 74 novel RR variants were identified at 33 STRs, and 13 FR variants (rs1770275883, rs2053373277, rs2082557941, rs1925525766, rs1926380862, rs1569322793, rs2051848492, rs2051848696, rs2016239814, rs2053269960, rs2044518192, rs2044536444, and rs2089968964) were submitted to dbSNP for the first time. In addition, 99.94% of alleles were concordant between the ForenSeq DNA Signature Prep Kit and commercial capillary electrophoresis (CE) kits. Discordance resulted from low performance at D22S1045 and occasionally at DYS392, flanking region deletions at D7S820 and DXS10074, and the strict alignment algorithm at DXS7132. Null alleles at DYS505 and DYS448 and multialleles at DYS387S1a/b, DYS385a/b, DYS448, DYS505, DXS7132, and HPRTB were validated with other MPS and CE kits. Thus, a high-resolution dataset of sequence-based (SB) and length-based (LB) allele frequencies for Northern Han Chinese has been established. As expected, moving from LB to SB alleles increased the forensic parameters significantly for combined power of discrimination (PD) and combined power of exclusion (PE) at A-STRs, mildly for combined PD and combined mean exclusion chance (MEC) at X-STRs, and barely for discrimination capacity (DC) at Y-STRs.
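To illustrate why sequence-based alleles raise the unique-allele count and gene diversity, the following minimal sketch compares the two allele definitions at a single locus. It assumes Nei's unbiased estimator of gene diversity, which the abstract does not specify. The function name `gene_diversity` and the toy observations are hypothetical and are not data from the study.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum(p_i^2)).

    Assumption: this estimator is chosen for illustration; the study's
    exact formula is not stated in the abstract.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two observed alleles")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical observations at one locus: (length-based call, sequence string).
# Two different sequences can share the same length-based call.
observations = [
    ("10", "[TCTA]10"),
    ("10", "[TCTA]10"),
    ("11", "[TCTA]11"),
    ("11", "[TCTA]10 TCTG"),
    ("12", "[TCTA]12"),
    ("12", "[TCTA]11 TCTG"),
]

lb_alleles = [length for length, _ in observations]
sb_alleles = [sequence for _, sequence in observations]

lb_unique = len(set(lb_alleles))
sb_unique = len(set(sb_alleles))
gd_lb = gene_diversity(lb_alleles)
gd_sb = gene_diversity(sb_alleles)

# Relative increase in unique alleles when sequence is considered.
unique_increase_pct = 100.0 * (sb_unique - lb_unique) / lb_unique

print(f"unique alleles: LB={lb_unique}, SB={sb_unique} (+{unique_increase_pct:.2f}%)")
print(f"gene diversity: LB={gd_lb:.4f}, SB={gd_sb:.4f}")
```

In this toy example, isoalleles that are indistinguishable by length are split into separate sequence-based alleles, so both quantities rise. The real dataset aggregates such comparisons over all loci.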
Additionally, MiSeq FGx quality metrics and MPS performance were evaluated across 20 consecutive runs and indicated a high-quality dataset: ≥ 60% of bases with a quality score of 20 or higher (% ≥ Q20), > 60% effective reads, > 2000× depth of coverage (DoC), ≥ 60% allele coverage ratio (ACR, or heterozygote balance), ≥ 70% inter-locus balance, and an absolute difference between expected and observed heterozygosity (|Hexp – Hobs|) of ≤ 0.4. In conclusion, the MiSeq FGx system can generate a high-resolution, high-quality dataset for human identification and population genetic studies.
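Two of these metrics can be sketched in a few lines. The sketch below assumes the common definitions: ACR as the lower read count divided by the higher read count of a heterozygous allele pair, expected heterozygosity as 1 − Σp², and observed heterozygosity as the fraction of heterozygous genotypes at a diploid locus. The abstract reports only the thresholds, so the function names, read counts, and genotypes here are hypothetical.

```python
from collections import Counter


def allele_coverage_ratio(reads_a, reads_b):
    """Heterozygote balance: lower read count / higher read count.

    Assumption: this is a common definition of ACR, not one quoted
    from the study.
    """
    higher = max(reads_a, reads_b)
    if higher <= 0:
        raise ValueError("read counts must include a positive value")
    return min(reads_a, reads_b) / higher


def heterozygosity_gap(genotypes):
    """Return |Hexp - Hobs| for one diploid locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    Hexp is computed as 1 - sum(p_i^2) from the pooled allele frequencies.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return abs(h_exp - h_obs)


# Hypothetical read counts for the two alleles of one heterozygous genotype.
acr = allele_coverage_ratio(1200, 1500)

# Hypothetical genotypes at one autosomal locus.
toy_genotypes = [("10", "11"), ("10", "10"), ("11", "12"), ("12", "12")]
gap = heterozygosity_gap(toy_genotypes)

# Thresholds as reported in the abstract.
print(f"ACR = {acr:.2f}, passes >= 0.60: {acr >= 0.60}")
print(f"|Hexp - Hobs| = {gap:.5f}, passes <= 0.4: {gap <= 0.4}")
```

The heterozygosity check applies only to diploid loci; hemizygous X-STR and Y-STR profiles in males would need separate handling.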