Abstract

The analysis of tumours using biomarkers in blood is transforming cancer diagnosis and therapy. Cancers are characterised by evolving genetic alterations, which makes it difficult to develop reliable and broadly applicable DNA-based biomarkers for liquid biopsy. In contrast to the variability in gene mutations, the methylation pattern remains generally constant during carcinogenesis. Methylation analysis may therefore be better suited than mutation analysis for recognising tumour features in the blood of patients. In this work, we investigated whether global CpG island methylation profiles can serve as a basis for predicting the cancer state of patients from liquid biopsy samples. (CpG denotes a CG dinucleotide in the context of methylation; the p stands for the phosphate linking the two bases, and the notation distinguishes CG sites relevant to methylation from other CG motifs or from mentions of CG content.) We retrieved existing GEO methylation datasets on hepatocellular carcinoma (HCC) and cell-free DNA (cfDNA) from HCC patients and healthy donors, as well as healthy whole blood and purified peripheral blood mononuclear cell (PBMC) samples, and used a random forest classifier as a predictor. Additionally, we tested three different feature selection techniques in combination with the random forest: ANOVA, mutual information, and LASSO. When using cfDNA samples together with solid tumour samples and healthy blood samples of different origin, we achieved an average accuracy of 0.98 in a 10-fold cross-validation. In this setting, all the feature selection methods we tested showed promising results. We also showed that it is possible to use solid tumour samples and purified PBMCs as a training set and correctly predict a cfDNA sample as cancerous or healthy. Unlike with the complete set of samples, the feature selection methods gave varying results with this training set. ANOVA feature selection worked well, and the selected features allowed the random forest to predict all cfDNA samples correctly. Feature selection based on mutual information also gave better-than-random results, whereas LASSO feature selection did not lead to a confident prediction. Our results show the relevance of CpG islands as tumour markers in blood.
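
To illustrate the idea behind ANOVA feature selection mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It scores each feature by a one-way ANOVA F statistic (between-class variance over within-class variance) and keeps the highest-scoring ones. The function names `anova_f` and `select_top_k`, the toy methylation values, and the choice of `k` are all assumptions made for illustration; the sketch also assumes at least two classes and more samples than classes.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(X, y, k):
    """Indices of the k features with the largest F statistic."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Hypothetical toy data: rows are samples, columns are CpG island methylation values.
X = [
    [0.10, 0.52, 0.80],
    [0.15, 0.48, 0.20],
    [0.12, 0.50, 0.55],
    [0.85, 0.51, 0.30],
    [0.90, 0.49, 0.75],
    [0.88, 0.53, 0.45],
]
y = ["healthy", "healthy", "healthy", "cancer", "cancer", "cancer"]

selected = select_top_k(X, y, k=1)
print(selected)
```

In this toy data only the first column separates the two classes, so it receives by far the largest F statistic. In practice a library routine such as scikit-learn's `f_classif` with `SelectKBest` would typically replace the hand-written scoring, with the selected columns then passed to the classifier.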

Highlights

  • Classifiers based on global methylation profiles of cancer tissue were established in recent years and are in the works for several more entities

  • Principal component analysis shows the difference between healthy and cancerous cell-free DNA (cfDNA) samples

  • The principal components (PCs) are computed from CpG sites (a) as well as CpG islands (b) as features of the respective input data to the principal component analysis (PCA)


Summary

Introduction

Classifiers based on global methylation profiles of cancer tissue were established in recent years and are in the works for several more entities. One such classifier can distinguish the approximately 100 known tumour types of the CNS and was even shown to be more precise than conservative histological classification for some tumour types [1]. Another classifier for solid tumours was established by Wu et al.; it can accurately diagnose and distinguish the bone sarcomas Ewing sarcoma, chondrosarcoma, and osteosarcoma [2]. Both classifiers were based on random forests and used the CpG sites’ methylation β-values as features.
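
For readers unfamiliar with β-values, the following minimal sketch shows how a β-value is commonly computed from methylated and unmethylated signal intensities on methylation arrays. The offset of 100 is the commonly used stabilising constant and is an assumption here, as are the function names; the per-island averaging in `island_mean_beta` is one hypothetical way to summarise CpG sites at the island level, since the text above does not specify the aggregation.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Fraction of methylated signal at one CpG site, between 0 and 1.

    The offset stabilises the ratio when both intensities are low
    (assumed default, not taken from the source text).
    """
    return methylated / (methylated + unmethylated + offset)


def island_mean_beta(site_betas):
    """Hypothetical island-level summary: mean β-value of the island's CpG sites."""
    return sum(site_betas) / len(site_betas)


# Made-up intensities for three CpG sites within one island.
sites = [beta_value(900.0, 0.0), beta_value(450.0, 450.0), beta_value(0.0, 900.0)]
island = island_mean_beta(sites)
print(sites, island)
```

A β-value near 1 indicates a mostly methylated site and a value near 0 a mostly unmethylated one, which is what makes these values convenient as classifier features.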

