Lung cancer is a leading cause of cancer mortality, underscoring the need for non-invasive methods of early detection. Cell-free DNA (cfDNA) analysis shows promise, but its sensitivity in early-stage lung cancer remains limited. This study aimed to integrate epigenetic modifications and fragmentomic features of cfDNA using machine learning to develop a more accurate lung cancer detection model. A multi-centre prospective cohort study was conducted, recruiting participants with lung nodules suspected to be malignant and healthy volunteers from two clinical centres. Plasma cfDNA was analysed for its epigenetic and fragmentomic profiles using chromatin immunoprecipitation sequencing, reduced representation bisulphite sequencing and low-pass whole-genome sequencing. Machine learning algorithms were then used to integrate the multi-omics data into a lung cancer detection model. Cancer-related changes in cfDNA fragmentomics were significantly enriched in specific genes marked by cell-free epigenomes. A total of 609 genes were identified, and the corresponding cfDNA fragmentomic features were used to construct an ensemble model. In the independent validation set, this model achieved a sensitivity of 90.4%, a specificity of 83.1% and an AUC of 0.94. Sensitivity was 95.1% for stage I lung cancer and 96.2% for minimally invasive adenocarcinoma. With feature selection guided by multiple epigenetic sequencing approaches, the cfDNA fragmentomics-based machine learning model performed well in the independent validation cohort, supporting its potential as a non-invasive strategy for the early detection of lung cancer.
Our study clarified how epigenetic modifications relate to cfDNA fragmentomic features. Identifying epigenetically regulated genes provided the foundation for feature selection in the cfDNA fragmentomics-based machine learning model. The model achieved a sensitivity of 90.4%, a specificity of 83.1% and an AUC of 0.94 in the independent validation set, indicating substantial potential for translation into clinical practice.
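
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity and AUC are conventionally computed from a classifier's scores. It is illustrative only: the labels, scores and 0.5 threshold are made-up toy values, the function names are hypothetical, and nothing here reproduces the study's data, model or decision threshold.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels: 1 = cancer, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers called cancer
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called cancer
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, not study data): 4 cases and 4 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
threshold = 0.5  # hypothetical cut-off; the study's threshold is not stated
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
model_auc = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={model_auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why abstracts such as this one typically report all three.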