Abstract

Short tandem repeats (STRs) are particularly difficult to genotype with rapidly evolving next-generation sequencing (NGS) technology. Long amplicons containing repetitive sequences result in alignment and genotyping errors, and stutter arising from polymerase slippage often produces reads with additional or missing repeat copies. Many tools are available for analyzing STR markers from NGS data. This study evaluated the concordance of the HipSTR, STRait Razor, and toaSTR tools for STR genotype calling, using NGS data obtained from a highly genetically diverse Brazilian population sample. We found that toaSTR retrieved the largest proportion of genotypes (93.8%), whereas HipSTR (84.9%) and STRait Razor (75.3%) called considerably fewer. Accuracy levels for genotype calling are very similar across the three methods (identical genotypes ~95% and correct alleles ~97.5%). All markers for which the three methods yielded the same genotype are in Hardy–Weinberg equilibrium. The combined match probability and combined exclusion power are 2.90 × 10⁻²⁸ and 0.99999999982, respectively. Despite locus-specific differences and the better overall performance of toaSTR, the three programs are reliable genotyping tools. Nevertheless, additional effort is necessary to improve the genotype calling accuracy of next-generation sequencing datasets.
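
The abstract reports a combined match probability and a combined exclusion power without stating how they were computed. As a minimal sketch, assuming the standard forensic definitions and statistically independent loci (the function names and inputs are hypothetical, not taken from the study), the two summary statistics can be obtained from per-locus values as follows:

```python
from math import prod


def match_probability(genotype_freqs):
    """Per-locus match probability: sum of squared genotype frequencies.

    Assumption: genotype_freqs holds the observed frequency of each
    distinct genotype at one locus (values summing to 1).
    """
    return sum(f * f for f in genotype_freqs)


def combined_match_probability(per_locus_pm):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return prod(per_locus_pm)


def combined_power_of_exclusion(per_locus_pe):
    """1 minus the product of per-locus non-exclusion probabilities.

    Assumption: per_locus_pe holds each locus's power of exclusion,
    and loci are independent.
    """
    return 1.0 - prod(1.0 - pe for pe in per_locus_pe)
```

Because the per-locus values are multiplied, every additional informative marker lowers the combined match probability and pushes the combined exclusion power closer to 1, which is why a multi-locus panel can reach values as extreme as those reported above.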
