Purpose: To use clinical information and transcriptomic sequencing data from glioblastoma (GBM) patients in the TCGA database for gene-by-gene analysis aligned with individual patient characteristics, and to develop an optimal prognostic index of survival-related variables (OPISV) through iterative machine learning techniques to predict the prognosis of GBM patients.
Study design: The TCGA dataset served as the training dataset, and two GEO datasets served as independent validation cohorts. First, survival analysis (p < 0.001***), differential gene expression analysis (p < 0.05*), and univariate Cox regression analysis (p < 0.05*) were used to identify genes that are highly correlated with patient prognosis and differ significantly by survival status. Next, with age retained as a non-excludable variable, stepwise multivariate Cox regression was performed to construct the OPISV. Finally, logistic and LASSO regressions were used to validate the association between the identified genes and patient survival. The performance of the OPISV was evaluated and its potential mechanisms were explored.
Results: Age, CTSD, PTPRN, PTPRN2, NSUN5, DNAJC30 and SOX21 emerged as the optimal variables through iterative multivariate Cox regression. Further analysis identified age, PTPRN and DNAJC30 as independent prognostic risk factors, which were used to construct the OPISV; the index was validated with the external GEO datasets and the GEPIA database. In the OPISV_high population, GABAergic synapse function was significantly upregulated. Differential genes identified by gene clustering of the GABAergic synapse pathway, and the gene module highly correlated with the GABAergic synapse in the WGCNA analysis, both pointed to glioma progression.
Conclusion: The OPISV is feasible for predicting patient survival, and the involvement of GABAergic synapses in GBM progression may be a potential underlying mechanism.
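As a rough illustration of how a Cox-derived prognostic index of this kind is commonly computed, the sketch below forms a linear predictor from age, PTPRN and DNAJC30 and splits a cohort into high and low groups. The coefficient values, the toy patient records, the median cutoff, and the function names are all assumptions made for illustration; the abstract does not report fitted coefficients or the cutoff used to define the OPISV_high population.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted Cox regression coefficients.
HYPOTHETICAL_COEFS = {"age": 0.03, "PTPRN": 0.25, "DNAJC30": 0.40}


def prognostic_index(patient, coefs=HYPOTHETICAL_COEFS):
    """Cox-style linear predictor: sum of coefficient * covariate value."""
    return sum(beta * patient[name] for name, beta in coefs.items())


def split_by_median(scores):
    """Label a score 'high' if it exceeds the cohort median, else 'low'.

    The median cutoff is an assumption, not a detail taken from the abstract.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Toy records (age in years, expression in arbitrary units); not real data.
patients = [
    {"age": 45, "PTPRN": 1.2, "DNAJC30": 3.1},
    {"age": 62, "PTPRN": 2.8, "DNAJC30": 2.0},
    {"age": 70, "PTPRN": 3.5, "DNAJC30": 1.4},
    {"age": 38, "PTPRN": 0.9, "DNAJC30": 3.6},
]

scores = [prognostic_index(p) for p in patients]
groups = split_by_median(scores)

for p, s, g in zip(patients, scores, groups):
    print(f"age={p['age']:>2}  index={s:.3f}  group={g}")
```

In practice the coefficients would come from the fitted multivariate Cox model, and the high and low groups would then be compared with Kaplan-Meier curves in the training and validation cohorts.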