Abstract

The discovery of single nucleotide variants (SNVs) from next-generation sequencing (NGS) data typically proceeds by aligning reads to a reference genome and then examining the resulting alignment map for evidence of SNVs. Various approaches have been developed to call either germline SNVs (or SNPs) in normal cells or somatic SNVs in cancer/tumor cells. Nonetheless, efficient callers that handle both germline and somatic SNVs have not yet been extensively investigated. In this paper, we present SNVSniffer, an integrated caller for germline and somatic SNVs from NGS data based on Bayesian probabilistic models. In SNVSniffer, germline SNV calling models the allele counts at each site as a multinomial conditional distribution. Somatic SNV calling, in turn, relies on NGS tumor-normal sample pairs and introduces a hybrid approach that combines a subtraction approach with a joint sample analysis, in which the tumor-normal allele counts at each site are modeled as a joint multinomial conditional distribution. Moreover, we investigate a lightweight tumor purity estimation approach, which demonstrates high accuracy on synthetic tumors. Compared to some leading SNP callers (SAMtools, GATK and FaSD) and somatic SNV callers (VarScan2, SomaticSniper, JointSNVMix2, MuTect), SNVSniffer demonstrates comparable or even better accuracy at faster speed. SNVSniffer, the synthetic tumor-normal data and the supplementary information are available at http://snvsniffer.sourceforge.net.
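To make the idea of modeling per-site allele counts as a multinomial distribution conditioned on genotype more concrete, the following is a minimal Python sketch of generic Bayesian genotype calling. It is not SNVSniffer's actual model: the uniform sequencing error rate `err`, the flat genotype prior, the diploid assumption, and all function names (`allele_probs`, `log_multinomial`, `genotype_posteriors`) are assumptions introduced here purely for illustration.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def allele_probs(genotype, err=0.01):
    """Probability of observing each base given a diploid genotype.

    Assumes (hypothetically) a uniform error rate: a read shows the true
    allele with probability 1 - err and each other base with err / 3.
    """
    probs = {}
    for base in BASES:
        p = 0.0
        for allele in genotype:  # each of the two alleles is sampled with prob. 0.5
            p += 0.5 * ((1.0 - err) if base == allele else err / 3.0)
        probs[base] = p
    return probs

def log_multinomial(counts, probs):
    """Log multinomial likelihood of the allele counts given base probabilities."""
    n = sum(counts.values())
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts.values())
    return log_coef + sum(c * math.log(probs[b]) for b, c in counts.items())

def genotype_posteriors(counts, err=0.01, prior=None):
    """Posterior over the 10 diploid genotypes via Bayes' rule.

    A flat prior is used unless one is supplied (an illustrative assumption).
    """
    genotypes = ["".join(g) for g in combinations_with_replacement(BASES, 2)]
    if prior is None:
        prior = {g: 1.0 / len(genotypes) for g in genotypes}
    log_post = {
        g: math.log(prior[g]) + log_multinomial(counts, allele_probs(g, err))
        for g in genotypes
    }
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {g: math.exp(v - m) / z for g, v in log_post.items()}

# Example: a site covered by 10 reads showing A and 9 reads showing G.
post = genotype_posteriors({"A": 10, "C": 0, "G": 9, "T": 0})
best = max(post, key=post.get)  # the heterozygous genotype "AG" dominates here
```

A germline SNV would then be reported when the most probable genotype differs from the homozygous reference genotype. A somatic caller in the same spirit would evaluate tumor and normal counts jointly rather than one sample at a time.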
