Abstract
Introduction: Repeat sequences comprise >50% of the human genome, and structural and epigenetic changes in these regions are implicated in cancer. However, no systematic analysis of the compendium of repeat sequences has ever been performed in human cancer or cell-free DNA (cfDNA), largely because repeat sequences could not be identified and quantified genome-wide. We describe here the first comprehensive analysis of genome-wide repeat landscapes in cancer and demonstrate their utility in cfDNA liquid biopsies.
Methods: We developed ARTEMIS (Analysis of RepeaT EleMents in dISease), an alignment-free, genome-wide approach for analyzing repeat landscapes in short-read sequencing data. This approach uses a de novo search of short sequences (kmers) in the telomere-to-telomere (CHM13) reference genome to identify 1.2 billion 24-mers uniquely defining 1280 individual repeat types occurring genome-wide across 57 subfamilies and 6 families. We analyzed ARTEMIS kmers in whole genome sequences of 525 matched tumor/normal pairs from breast, colorectal, liver, lung, ovarian, cervical, prostate, thyroid, head and neck, gastric, and bladder cancers in the Pan-Cancer Analysis of Whole Genomes (PCAWG), and in low-coverage (1-2x) whole genome sequences of 1450 cfDNA samples from individuals with and without 8 types of cancer.
Results: Analysis of ARTEMIS kmer repeat landscapes in 525 PCAWG tumors identified changes in all 1280 repeat element types, including 820 novel elements not previously known to be altered in cancer. A median of 807 repeat elements (range 246-1280) were altered in each tumor compared to its matched normal. The majority of changes were in repeat elements not previously described as altered in tumorigenesis and were most frequently found within Satellites, LINEs and SINEs, though changes were also observed in LTRs, Transposable Elements, and RNA Elements.
A cross-validated cfDNA model using repeat landscapes (ARTEMIS) and fragmentation features (DELFI) detected individuals in a diagnostic cohort (n=287) across all stages of lung cancer with high performance (AUC 0.91, 95% CI 0.88-0.95) and was externally validated in a separate population (n=513). The locked model generated scores that correlated with circulating tumor mutant allele fractions for patients (n=19) undergoing targeted lung cancer therapy (r=0.80, p<2.2e-16), and stratified progression-free survival (p<0.001). ARTEMIS repeat landscape analyses of cfDNA also detected liver cancer in a high-risk cohort (n=208) of individuals with cirrhosis or viral hepatitis (AUC 0.91, 95% CI 0.87-0.95), and identified tissue of origin among seven tumor types (n=423).
Conclusions: ARTEMIS reveals genome-wide repeat landscapes in human cancer, including in 820 novel elements not previously known to be altered in tumorigenesis. These repeat landscapes, which can now be described, are evaluable in the circulation and provide an avenue for noninvasive detection and characterization of cancer.
Citation Format: Akshaya Annapragada, Noushin Niknafs, James R. White, Daniel C. Bruhm, Christopher Cherry, Jamie E. Medina, Vilmos Adleff, Carolyn Hruban, Dimitrios Mathios, Zachariah H. Foda, Jillian Phallen, Robert B. Scharpf, Victor E. Velculescu. Genome-wide repeat landscapes in cancer and cell-free DNA [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 988.
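Illustrative sketch: the alignment-free idea described in the Methods, counting read kmers against a precomputed table of kmers that each map to one repeat type, might look roughly like the following. This is not the ARTEMIS implementation. The function name `count_repeat_kmers`, the two-entry `toy_table`, its repeat labels, and the example reads are all hypothetical and chosen only to make the example runnable. Strand handling (reverse complements) and coverage normalization are omitted.

```python
from collections import Counter

K = 24  # kmer length stated in the abstract


def count_repeat_kmers(reads, kmer_to_repeat, k=K):
    """Tally, per repeat type, how many kmers in the reads appear in the table.

    No alignment is performed: each read is scanned with a sliding window
    of length k and every window is looked up directly in the table.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            repeat_type = kmer_to_repeat.get(seq[i:i + k])
            if repeat_type is not None:
                counts[repeat_type] += 1
    return counts


# Hypothetical lookup table: each 24-mer maps to exactly one repeat type.
toy_table = {
    "ACGT" * 6: "toy_repeat_A",
    "TTAGGG" * 4: "toy_repeat_B",
}

# Hypothetical short reads.
reads = [
    "NN" + "ACGT" * 6 + "NN",  # one window matches toy_repeat_A
    "TTAGGG" * 5,              # two windows match toy_repeat_B
]

counts = count_repeat_kmers(reads, toy_table)
print(dict(counts))
```

In a tumor/normal or cfDNA setting, per-repeat counts like these would presumably be normalized for sequencing depth before samples are compared; that step is left out here because the abstract does not describe it.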