Pediatric glioblastoma, a highly aggressive brain tumor, is among the leading oncologic causes of mortality in children. Tumor progression is almost inevitable, and the disease typically recurs after first-line standard care. Because surgical resection is often more effective when tumors are small and localized, early detection and intervention may be essential for favourable outcomes in recurrent disease. This study applies clustering of single-cell RNA sequencing (scRNA-Seq) data together with an explainable artificial intelligence (EAI) framework to identify gene biomarkers and signature cell types for the diagnosis and prognosis of recurrent pediatric glioblastoma. Distinct cell types and statistically significant differentially expressed genes (DEGs) were identified from scRNA-Seq data retrieved from the Gene Expression Omnibus database. Random forest (RF) and extreme gradient boosting (XGBoost) machine learning (ML) classifiers were constructed, and SHapley Additive exPlanations (SHAP) values, an EAI framework, were used to select the genes contributing most to the disease classification. Potential biomarkers were chosen as the genes shared between the statistically identified DEGs and the genes ranked as relevant by SHAP. B cells, macrophages, CD8+ T cells, T cells, and NK cells were identified as distinct cell types that play an essential role in disease recurrence. In addition, five significant genes, namely HMGB2, H2AFZ, HIST1H4C, KIAA0101, and DUT, were screened, validated in silico through survival analysis and feature plots, and are therefore proposed as biomarkers for recurrent pediatric glioblastoma. Utilising these five genes may improve disease prognosis and provide a crucial understanding of the molecular causes of recurrent pediatric glioblastoma.
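The biomarker selection step described above (genes shared between the DEG list and the SHAP-relevant genes) can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, SHAP values, and the `top_k` cutoff are hypothetical placeholders, and requiring a gene to rank highly in both the RF and XGBoost models is an assumption, since the abstract does not state how the two models were combined. Ranking by mean absolute SHAP value is a common convention, also assumed here.

```python
from statistics import mean


def rank_genes_by_mean_abs_shap(shap_matrix, gene_names):
    """Rank genes by mean absolute SHAP value across cells.

    shap_matrix: one row per cell, one column per gene.
    Returns gene names ordered from most to least influential.
    """
    scores = {
        gene: mean(abs(row[j]) for row in shap_matrix)
        for j, gene in enumerate(gene_names)
    }
    return sorted(scores, key=scores.get, reverse=True)


def candidate_biomarkers(deg_set, shap_rankings, top_k):
    """Keep genes that are DEGs and appear in the top_k of every ranking.

    Intersecting across models is an assumption made for this sketch.
    """
    shared = set(deg_set)
    for ranking in shap_rankings:
        shared &= set(ranking[:top_k])
    return sorted(shared)


# Toy inputs: placeholder gene names and made-up SHAP values (not study data).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rf_shap = [[0.9, -0.1, 0.4, 0.0], [-0.7, 0.2, 0.5, 0.1]]
xgb_shap = [[0.5, 0.0, -0.6, 0.3], [0.6, 0.1, 0.8, -0.2]]
degs = {"GENE_A", "GENE_D"}  # hypothetical statistically significant DEGs

rf_ranking = rank_genes_by_mean_abs_shap(rf_shap, genes)
xgb_ranking = rank_genes_by_mean_abs_shap(xgb_shap, genes)
candidates = candidate_biomarkers(degs, [rf_ranking, xgb_ranking], top_k=2)
print(candidates)
```

In practice the SHAP matrices would come from an explainer applied to the trained classifiers, and the DEG set from a differential expression test on the clustered scRNA-Seq data; both are replaced by toy inputs here.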