Abstract Background: cfDNA methylation profiling allows early cancer detection and tissue-of-origin classification. Recent studies show that the end-repair (ER) process during double-stranded library preparation (dsLibP) introduces differential methylation signals between original and repaired sequences. We hypothesized that algorithmic correction of cfDNA jagged ends (JEs) could improve cfDNA methylation profiling performance and better preserve double-stranded cfDNA molecules. Methods: We introduce JEEPERS (Jagged-End Error Polishing of Enzymatically misRepaired Sequences), a novel approach for correcting such errors in dsLibP methylation data. JEEPERS not only detects and quantifies ER-induced errors at JEs but also corrects them in silico. When double-stranded unique molecular identifiers are used, JEEPERS leverages support from two sources to identify and correct errors at JEs: 1) complementary strands from individual DNA duplexes, and 2) sibling reads from other, non-jagged cfDNA families. JEEPERS relies on the 5′→3′ polarity of ER-associated errors and on JE length profiles to correct R2 for longer fragments and both R1 and R2 for shorter ones. Results: We quantified the prevalence of such errors at JEs of double-stranded cfDNA molecules in plasma from 2 healthy subjects. Comparing library preparations on single-stranded (SPLAT-Seq) vs double-stranded DNA (EM-Seq), we found that ~40% of CpGs were involved in jagged cfDNA ends and affected by misrepair during enzymatic ER of dsDNA. This resulted in artifactually 13.1-16.3% lower global methylation across the genome, an effect that was exaggerated in hypermethylated (hyper) regions (~69% of genome). ER-associated misrepair resulted in ~10.2% lower CpG methylation within the first 110bp of cfDNA fragments (R1 AUC=0.95, R2 AUC=0.81); when comparing R1 and R2, the final 30bp showed even larger biases (13.8-77.4%).
Remarkably, JEEPERS accurately detected and corrected these biases, producing highly uniform R1/R2 profiles in EM-Seq on par with SPLAT-Seq, even for short cfDNA fragments (<141bp) with substantial JE fractions. We also compared JEEPERS with other methods, including simple JE trimming and reliance on R1 alone, and found superior performance by JEEPERS for methylation quantification. Conclusions: ER distorts a large fraction of CpGs across the genome, clustered at cfDNA fragment ends, resulting in substantially decreased measured methylation levels in cfDNA. JEEPERS accurately recovers methylation levels distorted by such misrepair. This not only yields more accurate methylation levels but also retains cfDNA fragmentation features and duplex information. These advantages allow superior cfDNA genotyping of germline and somatic variants and better identification of allele-specific methylation, hemi-methylation, and gene expression inferences, as relevant for cancer detection. Citation Format: Rui Wang, Emily G. Hamilton, Angela Hui, Diego Almanza, Mohammad S. Esfahani, Maximilian Diehn, Ash A. Alizadeh. Improved cfDNA methylation profiling through correction of misrepaired jagged-ends [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 1024.