Abstract
A set of 1036 U.S. population samples was sequenced using the Illumina ForenSeq DNA Signature Prep Kit. This sample set has been highly characterized using a variety of marker systems for human identification. The FASTQ files obtained from a ForenSeq DNA Signature Prep Kit experiment include several STR loci that are not reported in the associated software. These include SE33, DXS8377, DXS10148, DYS456, and DYS461. The sequence variation within the autosomal STR marker SE33 was evaluated using a customized bioinformatic approach to identify and characterize the locus in the 1036 data set. The analysis identified 53 unique alleles by length and 264 by sequence. An additional 10 alleles were detected when selected extended flanking regions were examined to resolve discordances. Allele frequencies and SE33 sequence motif patterns are reported for the 1036 data set. Comparison of the numerical allele calls derived from sequence data with the allele calls obtained from commercial capillary electrophoresis-based STR typing kits showed 100% concordance after manual data review and confirmation sequencing of three flanking region deletions. The analysis of this data set required significant manual sequence curation and supporting information from length-based genotypes to ensure high confidence in the sequence-based allele calls. The challenges of interpreting the sequence data for SE33 consisted of high sequence noise, allele-size-dependent variance in coverage, and heterozygote imbalance. As allele length increased, sequence depth of coverage and quality decreased at the terminal end. Accordingly, heterozygote imbalance increased in proportion to the size difference between the two alleles.
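The distinction between alleles counted by length and alleles counted by sequence can be illustrated with a minimal sketch. It is not the customized bioinformatic approach used in the study. The observations are hypothetical toy strings rather than SE33 alleles from the 1036 data set, and the helper name `allele_frequencies` is an assumption introduced for illustration only.

```python
from collections import Counter

# Hypothetical toy observations: (length-based allele call, repeat-region sequence).
# Illustrative strings only; these are not SE33 alleles from the 1036 data set.
observations = [
    ("3", "AAAGAAAGAAAG"),
    ("3", "AAAGAGAGAAAG"),      # same length as above, different sequence
    ("3", "AAAGAAAGAAAG"),
    ("4", "AAAGAAAGAAAGAAAG"),
]


def allele_frequencies(keys):
    """Return the relative frequency of each distinct key (hypothetical helper)."""
    counts = Counter(keys)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


# Grouping by length merges alleles that differ only in internal sequence.
by_length = allele_frequencies(length for length, _ in observations)

# Grouping by sequence separates same-length alleles with different sequences.
by_sequence = allele_frequencies(seq for _, seq in observations)

print(len(by_length), "alleles by length;", len(by_sequence), "alleles by sequence")
```

In this toy input, two of the observations share a length call but differ in sequence, so the count by sequence exceeds the count by length, which is the same effect that separates the 53 length-based alleles from the 264 sequence-based alleles reported above.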