Abstract

Next-generation sequencing (NGS) is widely used in genetic testing for the highly sensitive detection of single nucleotide changes and small insertions or deletions. However, detection and phasing of structural variants, especially in repetitive or homologous regions, can be problematic due to uneven read coverage or genome reference bias, resulting in false calls. To circumvent this challenge, a computational approach utilizing customized scaffolds as supplementary reference sequences for read alignment was developed, and its effectiveness demonstrated with two CBS gene variants: NM_000071.2:c.833T>C and NM_000071.2:c.[833T>C; 844_845ins68]. Variant c.833T>C is a known causative mutation for homocystinuria, but is not pathogenic when in cis with the insertion, c.844_845ins68, because of alternative splicing. Using simulated reads, the custom scaffolds method resolved all possible combinations with 100% accuracy and, based on > 60,000 clinical specimens, exceeded the performance of current approaches that only align reads to GRCh37/hg19 for the detection of c.833T>C alone or in cis with c.844_845ins68. Furthermore, analysis of two 1000 Genomes Project trios revealed that the c.[833T>C; 844_845ins68] complex variant had previously been undetected in these datasets, likely due to the alignment method used. This approach can be configured for existing workflows to detect other challenging and potentially underrepresented variants, thereby augmenting accurate variant calling in clinical NGS testing.

Highlights

  • Next-generation sequencing (NGS) is widely used in genetic testing for the highly sensitive detection of single nucleotide changes and small insertions or deletions

  • Variant identification is generally done by aligning sequence reads to a genomic reference in order to assess the presence of any sequence v­ ariations[1], and its analytical sensitivity in variant detection depends greatly on read mapping accuracy

  • To enable the detection and phasing of targeted structural variants using short reads, we developed a computational method that utilizes variant-specific customized reference sequences as scaffolds for read alignment

Read more

Summary

Introduction

Next-generation sequencing (NGS) is widely used in genetic testing for the highly sensitive detection of single nucleotide changes and small insertions or deletions. Detection and phasing of structural variants, especially in repetitive or homologous regions, can be problematic due to uneven read coverage or genome reference bias, resulting in false calls To circumvent this challenge, a computational approach utilizing customized scaffolds as supplementary reference sequences for read alignment was developed, and its effectiveness demonstrated with two CBS gene variants: NM_000071.2:c.833T>C and NM_000071.2:c.[833T>C; 844_845ins68]. To enable the detection and phasing of targeted structural variants using short reads, we developed a computational method that utilizes variant-specific customized reference sequences as scaffolds for read alignment This customization allows reads from different haplotypes to map correctly onto the corresponding reference scaffolds, resulting in improved genotype determination. These scaffolds can be constructed for any target region and incorporated into an existing variant analysis workflow without extensive algorithm development or increased analysis time

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call