Abstract

Tumor evolution is a highly heterogeneous process in which multiple oncogenic pathways can lead to evasion of host defenses. Advances in genome sequencing are allowing us to better understand these processes and their impact. However, genomic variations that could be pivotal to tumor development may have remained undetected because of the technological constraints of short-read sequencing. PacBio HiFi sequencing generates accurate (>99.9%) long reads (>15 kb) with native 5-methylcytosine (5mC) information and can comprehensively delineate many variations, including ones that were previously inaccessible to short-read sequencing. Unlike standard whole-genome analysis, cancer variant calling must account for somatic variation, where the assumptions of a diploid genome are violated. We have developed and optimized a bioinformatics workflow for somatic variant detection with HiFi sequence data. To validate this pipeline, we sequenced two well-studied cancer cell lines (COLO829, HCC1395) to 60-fold depth and their matched normal lymphoblast cells to 40-fold to 60-fold depth, and called all classes of variation (SNVs, indels, and structural variants). Next, we compared our calls against the Valle-Inclan (structural variants) and SEQC2 (small variants: SNVs/indels) datasets. We achieved an F1 score of 92% for small variants across all variant allele frequencies (VAFs) in the HCC1395 experiment. Accuracy was maintained (F1=91%) even at lower variant allele frequencies (10-20%), highlighting the ability of HiFi sequencing to detect subclonal mutations. The workflow calls structural variants with high sensitivity, capturing 57/62 (92%) of all previously validated SVs in COLO829 across all VAFs. Lastly, we applied our pipeline to five diverse tumor samples with complex oncogenic phenotypes, including homologous recombination deficiency (HRD) and high genomic instability, for which prior sequencing efforts had not fully clarified the molecular drivers of tumorigenesis (e.g., unresolved rearrangement breakpoints in repetitive regions and unknown oncogenic drivers). Notably, we were able to identify and phase 5mC methylation of promoters together with large-scale structural variants in cancer genes involved in HRD. Taken together, these results demonstrate the high accuracy of HiFi reads in resolving somatic variants and their potential to provide novel insights into complex cancer samples. The ability of long reads to resolve the complex genomic aberrations and epigenetic markers that underlie tumorigenesis will become indispensable in precision oncology.

Citation Format: Khi Pin Chua, Ian McLaughlin, Oliver Hofmann, Kym Pham Stewart, Primo Baybayan, Jonathan Bibliowicz, Sean Grimmond, Zev Kronenberg, Michael A. Eberle. Somatic variant workflow with HiFi sequencing provides new insights in highly challenging cancer cases [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2934.