In this study, we propose a model of the stutter ratio for minus two base pair stutter (-2bpSR) at the D1S1656 locus in capillary electrophoresis (CE)-based short tandem repeat (STR) typing. DNA from 108 Japanese individuals was analyzed via massively parallel sequencing to determine the length of the longest uninterrupted stretch of the two-base repeat motif (2bpLUS value) within repetitive structures, including the flanking region. In addition, -2bpSR data were collected using the GlobalFiler Kit on a 3500xL Genetic Analyzer. Sequencing analysis showed that all alleles could be classified into two types by their 2bpLUS values, and the -2bpSR differed significantly between the two types. We therefore modeled the -2bpSR with a mixture log-normal distribution based on this classification of alleles by 2bpLUS value. Furthermore, the probability of each sequence type within each repeat number in the mixture log-normal distribution model was estimated using logistic regression for each of the five major detected populations. This study is expected to enable interpretation of STR typing that takes minus two base pair stutter at the D1S1656 locus into account.
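The structure of the model described above can be illustrated with a minimal sketch. It assumes a two-component log-normal mixture whose mixing weight is a logistic function of the repeat number; the function names, the single-predictor form of the logistic regression, and all parameter values are hypothetical and are not taken from the study.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution with log-scale mean mu and sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def type_probability(repeat_number, intercept, slope):
    """Logistic probability that an allele belongs to sequence type 1.

    Assumption: the repeat number is the only predictor.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * repeat_number)))


def stutter_ratio_density(sr, repeat_number, coef, components):
    """Mixture log-normal density of a -2bp stutter ratio `sr`.

    coef       -- (intercept, slope) of the logistic model (hypothetical values)
    components -- ((mu1, sigma1), (mu2, sigma2)) for the two 2bpLUS-based types
    """
    p1 = type_probability(repeat_number, *coef)
    (mu1, s1), (mu2, s2) = components
    return p1 * lognormal_pdf(sr, mu1, s1) + (1.0 - p1) * lognormal_pdf(sr, mu2, s2)


# Illustrative call with made-up parameters for one population.
density = stutter_ratio_density(
    sr=0.05,
    repeat_number=15,
    coef=(-6.0, 0.4),
    components=((-3.0, 0.4), (-4.5, 0.5)),
)
```

In this sketch, a separate `coef` pair would be supplied for each population, while the two log-normal components would be shared across populations; whether the study parameterizes the model this way is an assumption.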