Abstract

Studies have shown that somatic structural variation (SV) plays a key role in the oncogenic process. Traditionally, SVs in the cancer genome have been detected using low-resolution cytogenetic approaches, such as FISH, or microarray-based techniques. More recently, next-generation sequencing (NGS)-based technologies have been employed to detect SVs, including indels and translocations. However, both short- and long-read NGS-based approaches are limited in their ability to identify SV events accurately and delineate their breakpoints: short-read approaches must assemble billions of short reads across a heterogeneous cancer sample, while long-read sequencers require costly and burdensome laboratory infrastructure. We utilized a novel technology that combines microfluidics and molecular barcoding to generate libraries that are sequenced with an Illumina system. Open-source bioinformatics software produces linked-reads that maintain long-range information and single-molecule sensitivity. Cell lines and cancer samples were obtained from commercial sources, and genomic DNA was extracted. DNA sample indexing and partitioning were performed using the 10X Genomics GemCode instrument. One ng of sample DNA was used as input for each reaction, and DNA molecules were partitioned into droplets to fragment the DNA and introduce molecular barcodes. Following barcoding, droplets were fractured, and library DNA was purified and sequenced on Illumina sequencers. The GemCode Long Ranger software suite was used to map sequencing reads back to the original long molecules of DNA, generating reads linked to partition barcodes. This allows us to generate phased sequences covering tens to hundreds of kilobases.
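As an illustration of the linked-read idea only, and not the Long Ranger implementation, reads that share a partition barcode and map close together on the same chromosome can be grouped into an inferred source molecule. The sketch below assumes reads are already aligned and reduced to (barcode, chromosome, position) records; the `Read` and `Molecule` types, the `infer_molecules` function, and the 50 kb `max_gap` threshold are hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # partition barcode attached during library preparation
    chrom: str    # chromosome the read aligned to
    pos: int      # leftmost alignment position


class Molecule(NamedTuple):
    barcode: str
    chrom: str
    start: int
    end: int
    n_reads: int


def infer_molecules(reads, max_gap=50_000):
    """Group reads by (barcode, chromosome) and split each group wherever
    consecutive reads lie more than max_gap bases apart.

    max_gap is an assumed, illustrative threshold, not a published value.
    """
    positions_by_key = defaultdict(list)
    for read in reads:
        positions_by_key[(read.barcode, read.chrom)].append(read.pos)

    molecules = []
    for (barcode, chrom), positions in sorted(positions_by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append(Molecule(barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append(Molecule(barcode, chrom, start, prev, count))
    return molecules


if __name__ == "__main__":
    example = [
        Read("BC1", "chr2", 1_000),
        Read("BC1", "chr2", 20_000),
        Read("BC1", "chr2", 500_000),
        Read("BC2", "chr2", 1_500),
    ]
    for molecule in infer_molecules(example):
        print(molecule)
```

In this toy input, the two nearby BC1 reads are merged into one inferred molecule, while the distant BC1 read and the BC2 read each form their own. A real pipeline would also need to handle barcode sequencing errors, mapping quality, and barcode collisions between unrelated molecules.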
We first benchmarked the ability to call multiple SV types using a well-characterized germline HapMap sample (NA12878) as well as two recently characterized haploid hydatidiform moles (CHM1 and CHM13) that have been studied with multiple orthogonal technologies. Regions with evidence for structural variation were reassembled into distinct haplotypes. The barcode information allowed us both to phase the structural variants we detected and to disambiguate calls within highly repetitive regions, such as segmental duplications. We demonstrated high concordance with alternative approaches across all major classes of SVs, including long insertions and deletions as well as copy-neutral events. In cancer cell lines, we detected well-annotated gene fusions, such as the EML4/ALK and ALK/PTPN3 fusions in the lung cancer cell line NCI-H2228, and the SLC26A/PRKAR2A fusion in the triple-negative breast cancer cell line HCC38.

Citation Format: Sofia Kyriazopoulou-Panagiotopoulou, Patrick Marks, Haynes Heaton, Heather Ordonez, Kristina Giorda, Cassandra Jabara, Billy Lau, John M. Bell, Michael Schnall-Levin, Hanlee P. Ji. Linked-Reads enable detailed, phased resolution of structural variation in the cancer genome. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 3602.
