Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA

Nathan Wan, Yaping Liu, Tzu-Yu Liu, Eric A Ariazi, Adam Drake, Girish Putcha, Brandon White, Daniel Delubac, Signe Fransen, Catherina Tang, Riley Ennis, David S Weinberg, Jennifer Pecson, Gabriel Otte, John St John, Nathan Boley, Leilani Young, Ajay Kannan, Larry G Hansen, James M Cregg, Aarushi Sharma, Brandon J Rice, Erik Gafni, Marvin Bertin, Mitch Bailey, Gabriel E Sanderson, Abraham Tzou, Katherine E Niehaus, Derek Bowen, Imran S Haque

doi:10.1186/s12885-019-6003-8

Open Access

https://doi.org/10.1186/s12885-019-6003-8

Journal: BMC Cancer
Publication Date: Aug 23, 2019
Citations: 126
License type: open-access

Affiliation: Freenome (United States)

Abstract

Background: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer.

Methods: Whole-genome sequencing was performed on cfDNA extracted from plasma samples (N = 546 colorectal cancer and 271 non-cancer controls). Reads aligning to protein-coding gene bodies were extracted, and read counts were normalized. cfDNA tumor fraction was estimated using IchorCNA. Machine learning models were trained using k-fold cross-validation and confounder-based cross-validations to assess generalization performance.

Results: In a colorectal cancer cohort heavily weighted towards early-stage cancer (80% stage I/II), we achieved a mean AUC of 0.92 (95% CI 0.91–0.93) with a mean sensitivity of 85% (95% CI 83–86%) at 85% specificity. Sensitivity generally increased with tumor stage and increasing tumor fraction. Stratification by age, sequencing batch, and institution demonstrated the impact of these confounders and provided a more accurate assessment of generalization performance.

Conclusions: A machine learning approach using cfDNA achieved high sensitivity and specificity in a large, predominantly early-stage, colorectal cancer cohort. The possibility of systematic technical and institution-specific biases warrants similar confounder analyses in other studies. Prospective validation of this machine learning method and evaluation of a multi-analyte approach are underway.
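The headline figure in the abstract is a sensitivity reported at a fixed 85% specificity. As a minimal sketch of how such a number can be read off a set of classifier scores (not the authors' evaluation code), the Python below picks the lowest score threshold at which the target fraction of controls is called negative, then measures the fraction of cancer samples above it. The function name, the toy scores, and the convention that a sample is positive when its score is strictly above the threshold are all assumptions made for illustration.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity=0.85):
    """Return (threshold, specificity, sensitivity).

    labels: 1 = cancer, 0 = non-cancer control (assumed encoding).
    A sample is called positive when its score is strictly above the threshold.
    """
    controls = sorted(s for s, y in zip(scores, labels) if y == 0)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one control and one case")

    # Smallest number of controls that must be called negative to reach the
    # target; rounding first guards against floating-point noise in the product.
    k = max(1, math.ceil(round(target_specificity * len(controls), 9)))
    threshold = controls[k - 1]

    # Specificity actually achieved (ties can push it above the target).
    specificity = sum(s <= threshold for s in controls) / len(controls)
    sensitivity = sum(s > threshold for s in cases) / len(cases)
    return threshold, specificity, sensitivity


# Hypothetical scores, for illustration only (not data from the study).
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.55, 0.80, 0.90, 0.25]
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(sensitivity_at_specificity(toy_scores, toy_labels, target_specificity=0.75))
```

In a cross-validated setting this calculation would typically be applied to held-out predictions only, so that the threshold is not tuned on the samples used to fit the model.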

Highlights

Blood-based methods using cell-free deoxyribonucleic acid (DNA) are under development as an alternative to existing screening tests
Paired-end whole-genome sequencing (WGS) was performed on plasma cell-free DNA (cfDNA) obtained from 271 non-cancer control subjects and 546 colorectal cancer (CRC) patients (Table 1)
We have demonstrated that it is possible to take an ML-based approach to learn the relationship between a patient’s cfDNA profile and cancer diagnosis, with 85% sensitivity at 85% specificity in CRC using standard k-fold cross-validation; application of other rigorous and novel CV strategies designed to control for known confounding variables yielded 71–85% sensitivity at 85% specificity
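The highlights contrast standard k-fold cross-validation with CV strategies that control for known confounders such as sequencing batch and institution. One common way to do that, sketched below under stated assumptions, is to keep every sample from the same group (for example the same batch) on one side of each train/test split, so a model cannot score well simply by recognising a batch it has already seen. The function `grouped_folds` and the batch labels are hypothetical, and this is not claimed to be the exact scheme used in the paper.

```python
from collections import defaultdict


def grouped_folds(groups, n_folds):
    """Build train/test index splits in which no group spans both sides.

    groups: one group label per sample (e.g. sequencing batch or institution).
    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    members = defaultdict(list)
    for idx, group in enumerate(groups):
        members[group].append(idx)
    if n_folds < 2 or n_folds > len(members):
        raise ValueError("n_folds must be between 2 and the number of groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_folds)]
    for group in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[group])

    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(groups)) if i not in held_out]
        splits.append((train, sorted(fold)))
    return splits


# Hypothetical batch labels for eight samples, for illustration only.
batches = ["A", "A", "B", "B", "B", "C", "D", "D"]
for train_idx, test_idx in grouped_folds(batches, n_folds=2):
    print("train:", train_idx, "test:", test_idx)
```

Performance measured on such group-held-out folds is usually lower than with random k-fold splits, which is consistent with the 71–85% range quoted above for the confounder-controlled strategies.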

Summary

Introduction

Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. Early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer. Blood-based screening tests for cancer have been proposed in an effort to address some of the aforementioned challenges. One key area of both academic and commercial interest is the use of circulating cell-free DNA (cfDNA), which includes both tumor-derived DNA (so-called “circulating tumor DNA”, or ctDNA) and DNA derived from non-tumor cells, such as hematopoietic and stromal cells, to supplement or replace existing cancer screening methods.
