Abstract

3077 Background: As next-generation sequencing (NGS) spreads into routine clinical practice, data analysis methods are actively evolving to increase and refine the information yield. Little attention has been given to variant allele fraction (VAF) estimation, although it affects the assessed clinical significance of subclonal somatic variants and the identification of variant origin.

Methods: Real-world amplicon-based sequencing data from 2379 samples (781 tumor-only and 1598 blood-only sequencing datasets) were analyzed retrospectively to compare the VAF estimates that different methods, including ITVC, SiNVICT, Mutect2 and SGA, produce for identified genetic variants. An original method (AODvAF), which uses an alignment-free algorithm based on estimating the abundance of wild-type and alternative genome subsequences, was developed and compared with the original output of the NGS data analysis software. The anticipated VAF was based on analyte type and database annotation.

Results: Across 3447 identified potentially germline variants (PGV), VAF estimates from ITVC, SiNVICT and SGA were highly correlated (Pearson correlation coefficient [PCC] 1.00), whereas Mutect2 and AODvAF gave discordant results (PCC vs ITVC 0.87 for Mutect2 and 0.82 for AODvAF). VAF miscalculation, defined as an absolute difference of 10% or more between the estimated and anticipated VAF, was noted for 25%, 26%, 22%, 15% and 7% with ITVC, SiNVICT, SGA, Mutect2 and AODvAF, respectively. Among 308 pathogenic BRCA1, BRCA2 and PALB2 mutations identified in blood-only sequencing, 25, 15, 16, 10 and 5 had a VAF of 30% or lower according to ITVC, SiNVICT, SGA, Mutect2 and AODvAF estimates, respectively (54, 49, 33, 51 and 9 for a VAF of 40% or lower), which may lead to misinterpretation of these variants as post-zygotic. Automatic in-silico annotation of variant origin (germline vs somatic) with ISOWN was correct for 89%, 91%, 94%, 93% and 98% of these variants, respectively. The origin of 55 pathogenic variants in genes associated with hereditary cancer syndromes, identified in tumor-only sequencing, was verified by Sanger sequencing. Cohen's kappa coefficient for variant origin discrimination (Sanger vs in-silico prediction) was 0.91, 0.90, 0.91, 0.94 and 0.97 based on the VAF provided by ITVC, SiNVICT, SGA, Mutect2 and AODvAF, respectively.

Conclusions: Accurate VAF calculation allows precise identification of variant origin and affects variant interpretation. The developed method allows VAF to be corrected after core NGS data analysis.
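
The abstract does not give implementation details for AODvAF, so the following Python sketch is only an illustration of the general ideas it names, not the authors' algorithm. It assumes that "subsequence abundance" can be approximated by counting k-mers specific to the wild-type or the alternative allele in raw reads, that reads are on the forward strand, and that a heterozygous germline variant in blood has an anticipated VAF of 0.5. The function names (`estimate_vaf`, `is_miscalculated`, `cohen_kappa`), the k-mer length and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return all length-k substrings of seq (forward strand only)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def estimate_vaf(reads, ref_window, alt_window, k):
    """Alignment-free VAF sketch (an assumption, not the published method).

    ref_window / alt_window are the wild-type and alternative sequences
    around the variant. VAF is the mean abundance of alt-specific k-mers
    divided by the summed mean abundance of alt- and ref-specific k-mers.
    """
    ref_set, alt_set = set(kmers(ref_window, k)), set(kmers(alt_window, k))
    ref_only, alt_only = ref_set - alt_set, alt_set - ref_set
    if not ref_only or not alt_only:
        return None  # the alleles cannot be told apart at this k
    informative = ref_only | alt_only
    counts = Counter(
        km for read in reads for km in kmers(read, k) if km in informative
    )
    ref_depth = sum(counts[km] for km in ref_only) / len(ref_only)
    alt_depth = sum(counts[km] for km in alt_only) / len(alt_only)
    total = ref_depth + alt_depth
    return alt_depth / total if total else None


def is_miscalculated(estimated, anticipated, threshold=0.10):
    """Flag an absolute difference of 10% or more, as defined in the abstract.

    A small tolerance keeps boundary cases such as |0.4 - 0.5| from being
    missed because of floating-point rounding.
    """
    return abs(estimated - anticipated) >= threshold - 1e-9


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long lists of categorical calls."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(lab) / n) * (labels_b.count(lab) / n)
        for lab in set(labels_a) | set(labels_b)
    )
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)


# Toy data: three wild-type reads and one alternative read (A>G at the centre).
reads = ["GATTACAGC"] * 3 + ["GATTGCAGC"]
vaf = estimate_vaf(reads, "GATTACAGC", "GATTGCAGC", k=5)
print(vaf)                         # 0.25
print(is_miscalculated(vaf, 0.5))  # True: 25 points below the anticipated 0.5

sanger = ["germline", "germline", "somatic", "somatic"]
in_silico = ["germline", "germline", "somatic", "germline"]
print(cohen_kappa(sanger, in_silico))  # 0.5
```

Averaging counts per allele-specific k-mer, rather than summing them, keeps the two alleles comparable when an indel gives them different numbers of informative k-mers. A real implementation would also need to handle reverse-complement reads, sequencing errors and repeated k-mers elsewhere in the target region.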
