Abstract
Purpose: Several cell-free DNA (cf-DNA) features, such as genome-wide coverage, fragment size, and fragment end motif frequency, have shown potential for cancer detection. In this study, we developed two independent models: GC (gross chromatin) and FEMS (fragment end motif frequency and size). Each model uses images generated from genome-wide normalized sequencing coverage and cf-DNA fragment end motif frequencies according to the different cf-DNA size profiles. We then integrated them into a single ensemble model to improve the accuracy of cancer detection and multi-cancer type classification.
Methods: Low-depth cell-free whole-genome sequencing (cf-WGS) data were generated from 1,396 patients (stage I: 14.9%, stage II: 35.6%, stage III: 24.9%, stage IV: 24.2%, unknown: 0.4%) with breast (n=702), liver (n=213), esophageal (n=155), ovarian (n=151), pancreatic (n=85), lung (n=53), head and neck (n=16), biliary tract (n=15), and colon cancer (n=6), and from 417 healthy individuals. Samples were randomly split into training, validation, and test sets, stratified by cancer type and stage. Cancer types with a small number of samples (<20) were excluded from multi-cancer type classification. Each model was trained using a convolutional neural network, and the two were then integrated into a single ensemble model by averaging the predicted probabilities calculated by each model.
Results: For cancer detection in the test dataset, the ensemble model achieved sensitivities of 85.2% [95% confidence interval (CI): 71.8% to 94.5%], 74.9% (CI: 68.0% to 88.0%), and 73.2% (CI: 66.7% to 85.9%) at specificities of 95%, 98%, and 99%, respectively, and an AUC of 0.97 (CI: 0.95 to 0.99). By cancer stage, sensitivity at 99% specificity was 62.8% (CI: 48.8% to 83.7%) in stage I, 66.3% (CI: 57.7% to 82.7%) in stage II, 85.9% (CI: 77.5% to 94.4%) in stage III, and 76.1% (CI: 63.4% to 87.3%) in stage IV. For multi-cancer classification, an overall accuracy of 85.1% (CI: 80.4% to 89.3%) was achieved across the 6 cancer types included.
Conclusions: A highly sensitive and accurate deep learning model for cancer detection and multi-cancer classification was developed by combining different types of cf-DNA features. This result provides an opportunity for general-population multi-cancer screening using various cf-DNA features.
Citation Format: Tae-Rim Lee, Jin Mo Ahn, Joo Hyuk Sohn, Sook Ryun Park, Min Hwan Kim, Gun Min Kim, Ki-Byung Song, Eunsung Jun, Dongryul Oh, Jeong-Won Lee, Joseph J Noh, Young Sik Park, Sun-Young Kong, Sang Myung Woo, Bo Hyun Kim, Eui Kyu Chie, Hyun-Cheol Kang, Youn Jin Choi, Ki-Won Song, Jeong-Sik Byeon, Junnam Lee, Dasom Kim, Chang-Seok Ki, Eunhae Cho. Deep learning algorithm for multi-cancer detection and classification using cf-WGS [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 6371.
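Illustrative sketch (not the authors' code): the abstract describes the ensemble as an average of the predicted probabilities from the GC and FEMS models, and reports sensitivity at fixed specificities. The Python below shows one plausible way to compute both. The function names, the equal weighting of the two models, and the rule for choosing the threshold from healthy-sample scores are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def ensemble_average(prob_gc, prob_fems):
    """Average per-sample cancer probabilities from two models.

    Assumption: equal weights, one probability per sample per model.
    """
    prob_gc = np.asarray(prob_gc, dtype=float)
    prob_fems = np.asarray(prob_fems, dtype=float)
    if prob_gc.shape != prob_fems.shape:
        raise ValueError("model outputs must have the same shape")
    return (prob_gc + prob_fems) / 2.0


def sensitivity_at_specificity(scores, labels, specificity):
    """Return (sensitivity, threshold) at a target specificity.

    Assumption: a sample is called positive when its score is strictly
    above the threshold, and the threshold is taken from the healthy
    scores so that at most (1 - specificity) of them are called positive.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)  # True = cancer, False = healthy
    healthy = np.sort(scores[~labels])
    if healthy.size == 0 or not labels.any():
        raise ValueError("need at least one healthy and one cancer sample")

    # Number of healthy samples allowed to score above the threshold.
    # The small epsilon guards against floating-point round-off.
    n_fp_allowed = int(np.floor((1.0 - specificity) * healthy.size + 1e-9))
    n_fp_allowed = min(n_fp_allowed, healthy.size - 1)

    threshold = healthy[healthy.size - 1 - n_fp_allowed]
    sensitivity = float(np.mean(scores[labels] > threshold))
    return sensitivity, float(threshold)
```

With this rule, the threshold is fixed using healthy samples only, so the reported sensitivity depends on how many healthy samples are available at the chosen specificity; with few healthy samples, 98% and 99% specificity can map to the same threshold.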