Abstract
Here, we detail in-progress genome-scale measurements, from a variety of technologies, for the first tumor-normal benchmark from the Genome in a Bottle (GIAB) consortium. We created the first broadly consented tumor cell line from a pancreatic ductal adenocarcinoma with matched normal pancreatic and duodenal tissue. Data are being collected from a large homogeneous batch of the tumor cell line and paired normal tissues. As we receive data, we make it publicly available on the NCBI-hosted GIAB FTP site, https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data_somatic/. When complete, this dataset will include whole-genome measurements from Illumina, ONT, Hi-C, Bionano, single-cell WGS, PacBio HiFi and Onso, and Element. Analysis and development of a variant calling benchmark using these data are ongoing in collaboration with an open working group of the GIAB consortium. Initial Hi-C and optical mapping data from the tumor cell line indicate substantial aneuploidy from translocations that cause large deletions. We found roughly 17 large inversions and translocations and 16 chromosomes with extensive loss of heterozygosity (more than 30% of one copy missing), in addition to a few smaller duplications. Low-coverage single-cell sequencing, which provides ploidy estimates across chromosomes, showed that most of the large deletions appear in all or nearly all cells, with some variation in a few cells. These observations are consistent with bulk WGS analyses from two batches of cells. Additionally, many of the observed large deletions correspond to deletions seen in the population of TCGA samples with chromosomal loss of heterozygosity. Examining somatic SNVs in non-repetitive regions, we find that close to 60% occur in almost all cells in diploid regions, 30% occur in almost all cells in haploid regions, and about 5% each occur in only some cells in diploid regions and in haploid regions. We take these results to indicate that the cell line is relatively homogeneous and stable, with most CNVs being deletions.
As such, we plan to explore using long reads, ultralong reads, and Hi-C data to generate a near-complete genome assembly of the dominant tumor clone as well as a complete diploid assembly of the normal. With this personalized genome assembly, we will explore aligning tumor and normal reads to each haplotype of the normal to characterize somatic variants, including variants in minor clones. Building on the methods GIAB and T2T have developed to polish diploid assemblies of GIAB’s normal genomes and to characterize mosaic variants, we will develop benchmarks for somatic variants against the normal assembly as well as standard references.

Citation Format: Justin Wagner, Jennifer McDaniel, Gail L. Rosen, Nathan D. Olson, Vaidehi Patel, Chunlin Xiao, Andrew Liss, Justin Zook. Continued analysis of extensive data towards Genome in a Bottle benchmarks for a new tumor normal pair [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3551.