Abstract

3058

Background: Circulating cell-free DNA (cfDNA) is largely nucleosomal in origin, with a typical fragment length of 167 base pairs that reflects the length of DNA wrapped around the histone and the H1 linker. Given this nucleosomal origin, we have previously used low-coverage whole-genome sequencing to evaluate DNA fragmentation profiles and to sensitively and specifically detect tumor-derived DNA with altered fragment lengths or coverage.

Methods: Here we evaluate the use of Bayesian finite mixtures to model the fragment length distribution and demonstrate how the parameters of these models can help distinguish between individuals with and without cancer. We examined the number of cfDNA fragments by size over the range 100 to 220 bp and approximated the location, scale, and weight of each mixture component using Markov chain Monte Carlo. Performance was determined by ten-fold cross-validation, repeated ten times, of a gradient boosted machine model using as features 1) our previously described genome-wide fragmentation profile approach, 2) the parameters from the mixture model, and 3) a combination of approaches 1) and 2).

Results: In this study of 215 cancer patients and 208 cancer-free individuals, we observed cross-validated AUCs of 0.94, 0.95, and 0.97 for approaches 1), 2), and 3), respectively.

Conclusions: Our findings indicate that parsimonious mixture models, used in conjunction with fragmentation profile analyses across the genome, may improve detection of cancer.
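
As an illustration of the kind of model described in the Methods, the following is a minimal sketch of a Gibbs sampler (one form of Markov chain Monte Carlo) for a finite mixture fitted to fragment lengths. It is not the authors' implementation. The Gaussian component family, the choice of two components, the weak priors, the ordering constraint on the locations, and the simulated data are all assumptions made for this example; the function name `gibbs_gaussian_mixture` is hypothetical.

```python
import numpy as np

def gibbs_gaussian_mixture(x, k=2, n_iter=300, seed=0):
    """Posterior-mean location, scale and weight of a k-component
    Gaussian mixture, estimated by Gibbs sampling (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = x.size
    # Initial values: spread the locations over the data quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sigma2 = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    # Weak priors (assumed, not taken from the abstract).
    m0, s0sq = x.mean(), 100.0 * x.var()  # Normal prior on each location
    a0, b0 = 2.0, x.var()                 # Inverse-gamma prior on each variance
    alpha = np.ones(k)                    # Dirichlet prior on the weights
    draws = []
    for it in range(n_iter):
        # Step 1: sample the component label of every fragment.
        logp = np.log(w) - 0.5 * np.log(sigma2) \
            - 0.5 * (x[:, None] - mu) ** 2 / sigma2
        logp -= logp.max(axis=1, keepdims=True)
        p = np.exp(logp)
        p /= p.sum(axis=1, keepdims=True)
        u = rng.random(n)
        z = np.minimum((p.cumsum(axis=1) < u[:, None]).sum(axis=1), k - 1)
        # Step 2: sample weights, locations and variances given the labels.
        counts = np.bincount(z, minlength=k)
        w = rng.dirichlet(alpha + counts)
        for j in range(k):
            xj = x[z == j]
            prec = 1.0 / s0sq + counts[j] / sigma2[j]
            mean = (m0 / s0sq + xj.sum() / sigma2[j]) / prec
            mu[j] = rng.normal(mean, np.sqrt(1.0 / prec))
            a = a0 + counts[j] / 2.0
            b = b0 + 0.5 * ((xj - mu[j]) ** 2).sum()
            sigma2[j] = 1.0 / rng.gamma(a, 1.0 / b)
        # Keep components ordered by location to limit label switching.
        order = np.argsort(mu)
        mu, sigma2, w = mu[order], sigma2[order], w[order]
        if it >= n_iter // 2:  # discard the first half as burn-in
            draws.append(np.concatenate([mu, np.sqrt(sigma2), w]))
    return np.mean(draws, axis=0)  # [locations, scales, weights]

# Simulated fragment lengths in bp (made-up values, not study data).
sim = np.random.default_rng(1)
lengths = np.concatenate([sim.normal(140, 6, 600), sim.normal(170, 8, 1400)])
est = gibbs_gaussian_mixture(lengths, k=2)
```

Under these assumptions, `est` holds six numbers per sample (two locations, two scales, two weights), which could serve as a small feature vector for a downstream classifier in the way the abstract describes for the mixture parameters.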
