Abstract

MiST is a novel approach to variant calling from deep sequencing data, using the inverted mapping approach developed for Geoseq. Reads that can map to a targeted exonic region are identified using exact matches to tiles from the region. The reads are then aligned to the targets to discover variants. MiST carefully handles paralogous reads that map ambiguously to the genome and clonal reads arising from PCR bias, which are the two major sources of errors in variant calling. The reduced computational complexity of mapping selected reads to targeted regions of the genome improves the speed, specificity and sensitivity of variant detection. Compared with variant calls from the GATK platform, MiST showed better concordance with SNPs from dbSNP and with genotypes determined by an exonic-SNP array. Variant calls made only by MiST were confirmed at a high rate (>90%) by Sanger sequencing. Thus, MiST is a valuable alternative tool to analyse variants in deep sequencing data.
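
The tile-based read selection described in the abstract can be illustrated with a minimal sketch. This is not the MiST implementation: the function names, the use of overlapping tiles, and the tile length are assumptions made for illustration, and reverse-complement matching, paralog handling and clonal-read filtering are all omitted.

```python
from collections import defaultdict


def build_tile_index(targets, tile_len):
    """Index every overlapping tile (substring of length tile_len) of each target.

    targets: dict mapping a target name to its sequence (hypothetical input format).
    Returns a dict mapping each tile to the set of target names that contain it.
    """
    index = defaultdict(set)
    for name, seq in targets.items():
        for i in range(len(seq) - tile_len + 1):
            index[seq[i:i + tile_len]].add(name)
    return index


def select_reads(reads, index, tile_len):
    """Keep reads that share at least one exact tile with a target.

    Returns a dict mapping each target name to the reads selected for it.
    A read carrying a variant is still selected as long as some tile-length
    stretch of it matches the target exactly.
    """
    selected = defaultdict(list)
    for read in reads:
        hits = set()
        for i in range(len(read) - tile_len + 1):
            hits |= index.get(read[i:i + tile_len], set())
        for name in hits:
            selected[name].append(read)
    return dict(selected)


# Toy data: one short "exonic" target and three reads.
targets = {"exon1": "ACGTACGTTAGCCGATAGGCTA"}
reads = ["TTAGCCGATA", "GGGGGGGGGG", "TTAGCCGATC"]
index = build_tile_index(targets, tile_len=8)
selected = select_reads(reads, index, tile_len=8)
```

In this toy example the third read differs from the target at its last base but is still selected, because its first eight bases match a tile exactly. Only the selected reads would then go on to the more expensive alignment step against the target.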

Highlights

  • Whole-exome sequencing (WES), mRNA-seq and whole-genome sequencing are among several commonly used techniques based on deep sequencing that allow extensive sampling of the genome to detect variants

  • Using Sanger sequencing of the data from trios, we found that false positives predominated when the coverage was below 15 and/or the minor allele frequency dropped below 0.3; these values are our thresholds for variant calling

  • We identify variants in our list that occur in dbSNP, 1000 Genomes or private collections to highlight the novel variants that are of interest in most studies of rare genetic disorders
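
The coverage and allele-frequency thresholds mentioned in the highlights can be expressed as a simple filter. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that "minor allele frequency" means the fraction of reads at a site that support the candidate variant allele, and that a site exactly at a threshold passes.

```python
MIN_COVERAGE = 15          # calls below this read depth were mostly false positives
MIN_ALLELE_FRACTION = 0.3  # minimum fraction of reads supporting the variant allele


def passes_thresholds(variant_reads, total_reads):
    """Return True if a candidate call meets both thresholds.

    variant_reads: number of reads supporting the candidate variant allele.
    total_reads: total read depth (coverage) at the site.
    """
    if total_reads < MIN_COVERAGE:
        return False
    return variant_reads / total_reads >= MIN_ALLELE_FRACTION


# Hypothetical candidate sites as (variant_reads, total_reads) pairs.
candidates = [(10, 20), (5, 20), (7, 14), (15, 15)]
kept = [c for c in candidates if passes_thresholds(*c)]
```

Here the site with 5 of 20 supporting reads fails on allele fraction, and the site with 7 of 14 fails on coverage even though half of its reads support the variant.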


Summary

Introduction

Whole-exome sequencing (WES), mRNA-seq and whole-genome sequencing are among several commonly used techniques based on deep sequencing that allow extensive sampling of the genome to detect variants. Several software pipelines analyse the data from WES, including GATK [2,3], Samtools [4], Freebayes [5] and Bambino [6]. Their approach involves mapping the sequences to the reference genome to generate BAM/SAM files. These alignment files are subsequently analysed to infer variants and SNPs [7]. A subsequent fine-mapping step, which aligns the selected reads against the exonic regions, permits a more sensitive and accurate identification of SNPs and variants. This approach reduces the computational complexity and allows for more sensitive mapping.
