Abstract

Intra-tumor heterogeneity renders the identification of somatic single-nucleotide variants (SNVs) a challenging problem. In particular, low-frequency SNVs are hard to distinguish from sequencing artifacts. While the increasing availability of multi-sample tumor DNA sequencing data holds the potential for more accurate variant calling, there is a lack of high-sensitivity multi-sample SNV callers that utilize these data. Here we report Moss, a method to identify low-frequency SNVs that recur in multiple sequencing samples from the same tumor. Moss provides any existing single-sample SNV caller the ability to support multiple samples with little additional time overhead. We demonstrate that Moss improves recall while maintaining high precision in a simulated dataset. On multi-sample hepatocellular carcinoma, acute myeloid leukemia and colorectal cancer datasets, Moss identifies new low-frequency variants that meet manual review criteria and are consistent with the tumor’s mutational signature profile. In addition, Moss detects the presence of variants in more samples of the same tumor than reported by the single-sample caller. Moss’ improved sensitivity in SNV calling will enable more detailed downstream analyses in cancer genomics.

Highlights

  • Intra-tumor heterogeneity renders the identification of somatic single-nucleotide variants (SNVs) a challenging problem

  • To understand mechanisms of tumorigenesis and devise personalized treatment plans, it is important to fully characterize the extent of intra-tumor heterogeneity of a cancer. This begins with accurately calling the set of somatic mutations that are present in a tumor, which is the first step in many important downstream analyses in cancer genomics, including identifying mutations that drive cancer progression[2], reconstructing the evolutionary history of a tumor[3,4], predicting the response to immunotherapy[5], identifying mutational signatures[6,7] and reconstructing repeated patterns of cancer evolution[8] and metastasis[9]

  • Moss extracts the set of candidate loci by taking the union of the positions of all the SNV records in the VCF files, as well as the normal alleles inferred by the base caller

Read more

Summary

Introduction

Intra-tumor heterogeneity renders the identification of somatic single-nucleotide variants (SNVs) a challenging problem. Current single-sample SNV callers are unable to unlock the potential of these data, which enable more accurate variant calling because the probability of the same sequencing error occurring in all tumor samples at the same locus decreases significantly with an increasing number of tumor samples (Fig. 1a). We propose Moss, a somatic SNV caller that leverages the additional information afforded by multi-sample tumor data to enable improved sensitivity SNV calling compared to existing single-sample somatic SNV callers. Our simulations show that running Moss in conjunction with Mutect[2] (single-sample mode) or Strelka[2] accurately recovers variants with low VAF, thereby increasing recall while maintaining high precision.

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call