Abstract

With the advent of next-generation sequencing technologies, a variety of structural variant (SV) calling algorithms have been developed. However, these algorithms are generally not both sensitive and specific, in part because they are biased toward identifying specific types or lengths of SVs. As a result, concordance among algorithms is often very low. At the National Institute of Standards and Technology (NIST), we have previously rigorously characterized sequencing biases across multiple sequencing platforms for the candidate NIST reference material, RM 8398 (the genome of individual NA12878). This yielded a high-confidence set of SNP and small indel (<40 bp) variant calls. To extend our methods to SVs, we have developed SVClassify, which classifies SVs as likely true or false positive variants by combining evidence from one or more sequencing datasets. For NA12878, we were able to separate a set of validated deletions from random genomic regions with false positive and false negative rates of less than 5%. We also found that the set of validated deletions clustered into different categories, including heterozygous Alu deletions, homozygous Alu deletions, and other heterozygous deletions. Until now, SVClassify has been used only to classify large deletions in the non-cancerous genome NA12878. To assess the sensitivity and specificity of SVClassify for correctly classifying translocations, we are using the RSVSim R package to simulate different numbers of translocations within repeat regions of the human genome. Furthermore, we will investigate the performance of SVClassify on cancer genomes, particularly pediatric solid tumors, which exhibit extensively rearranged genomes compared to their normal counterparts. To do so, pediatric tumor and normal datasets from multiple sequencing technologies will be integrated.
To represent a spectrum of translocations, we will assess variant calls from cancer genomes with varying degrees of rearrangement: neuroblastoma, Ewing sarcoma, and osteosarcoma. Finally, we will use SVClassify to classify candidate SV calls made using the Complete Genomics pipeline, as well as those made on Illumina datasets using the recently developed SMUFIN algorithm.

Citation Format: Jo Lynne Harenza, Hemang M. Parikh, Jun S. Wei, Xinyu Wen, Sivasish Sindiri, Rajesh Patidar, Marc Salit, Paul S. Meltzer, Javed Khan, Justin Zook. Use of the SVClassify algorithm to classify pediatric solid tumor translocation variant calls as likely true or false positives. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 1077. doi:10.1158/1538-7445.AM2015-1077

