Abstract

Oncogene amplification, a major driver of cancer pathogenicity, is often mediated through focal amplification of genomic segments. Recent results implicate extrachromosomal DNA (ecDNA) as the primary driver of focal copy number amplification (fCNA), enabling gene amplification, rapid tumor evolution, and the rewiring of regulatory circuitry. Resolving an fCNA’s structure is a first step in deciphering the mechanisms of its genesis and its subsequent biological consequences. We introduce a computational method, AmpliconReconstructor (AR), for integrating optical mapping (OM) of long DNA fragments (>150 kb) with next-generation sequencing (NGS) to resolve fCNAs at single-nucleotide resolution. AR uses an NGS-derived breakpoint graph alongside OM scaffolds to produce high-fidelity reconstructions. After its performance was validated through multiple simulation strategies, AR was used to reconstruct fCNAs in seven cancer cell lines, revealing the complex architecture of ecDNA, a breakage-fusion-bridge, and other complex rearrangements. By reconstructing the rearrangement signatures associated with an fCNA’s generative mechanism, AR enables a more thorough understanding of the origins of fCNAs.

Highlights

  • Oncogene amplification, a major driver of cancer pathogenicity, is often mediated through focal amplification of genomic segments

  • We present AmpliconReconstructor (AR), a computational method for reconstructing large, complex focal copy number amplifications (fCNAs)

  • AR includes a scaffolding module that takes a collection of breakpoint graph segments aligned to optical mapping (OM) contigs as input and creates scaffolds represented by directed acyclic graphs (DAGs) (Fig. 1c–e; Methods, “Reconstructing amplicon paths with AR”)


Summary

Introduction

Oncogene amplification, a major driver of cancer pathogenicity, is often mediated through focal amplification of genomic segments. Fewer methods are available to handle the more difficult problem of ordering and orienting multiple genomic segments joined by breakpoints into high-confidence, copy number-aware scaffolds, which are subsequently joined to enable complete reconstructions of complex rearrangements[6,25]. This problem represents the key algorithmic challenge addressed by our work. In practice, path/cycle extraction is often confounded by duplications of large genomic regions inside an amplicon (Supplementary Fig. 1a) and by imperfections in the graph arising from errors in the estimation of segment copy numbers and from erroneous and/or missing breakpoints. The integrated NGS data and OM data provide an orthogonal pairing of short- and long-range information about genomic structural variation.
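To make the scaffolding idea concrete, the following is a minimal sketch of how segment alignments on a single OM contig might be arranged into a directed acyclic graph and reduced to one ordered, oriented chain. It is not AR's implementation: the `Alignment` record, the `max_overlap` tolerance, and the "most labels covered" scoring rule are all illustrative assumptions, and the coordinates are made-up contig label indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One breakpoint graph segment aligned to an OM contig (hypothetical record)."""
    segment: str  # segment identifier (illustrative naming)
    strand: str   # '+' or '-' orientation of the segment on the contig
    start: int    # first contig label index covered
    end: int      # last contig label index covered (inclusive)


def build_scaffold_dag(alignments, max_overlap=0):
    """Link alignment u -> v when v begins after u ends on the contig.

    Nodes are sorted by contig position and edges only point to later
    nodes, so the resulting graph cannot contain a cycle.
    max_overlap is an assumed tolerance, in labels, for overlapping ends.
    """
    nodes = sorted(alignments, key=lambda a: (a.start, a.end))
    edges = {i: [] for i in range(len(nodes))}
    for i, u in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            v = nodes[j]
            if v.start > u.end - max_overlap and v.end > u.end:
                edges[i].append(j)
    return nodes, edges


def heaviest_path(nodes, edges):
    """Return the chain of alignments covering the most contig labels."""
    if not nodes:
        return []
    best = [0] * len(nodes)    # best total weight of a chain starting at i
    nxt = [None] * len(nodes)  # successor of i on that chain
    for i in reversed(range(len(nodes))):
        weight = nodes[i].end - nodes[i].start + 1
        best[i] = weight
        for j in edges[i]:
            if weight + best[j] > best[i]:
                best[i] = weight + best[j]
                nxt[i] = j
    i = max(range(len(nodes)), key=best.__getitem__)
    path = []
    while i is not None:
        path.append(nodes[i])
        i = nxt[i]
    return path


# Toy input: segment A appears twice (a duplication inside the amplicon),
# and segment C is a short alignment that conflicts with its neighbours.
example = [
    Alignment("A", "+", 0, 9),
    Alignment("B", "-", 10, 24),
    Alignment("C", "+", 8, 12),
    Alignment("A", "+", 25, 34),
]
nodes, edges = build_scaffold_dag(example)
path = heaviest_path(nodes, edges)
print(" ".join(a.segment + a.strand for a in path))  # A+ B- A+
```

In this toy case the conflicting short alignment is left out and the duplicated segment is placed twice, which is the kind of ordering and orientation information that a breakpoint graph alone cannot always disambiguate.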

