Background
Gliomas are the most common primary brain tumours, and glioblastomas account for approximately half of all malignant gliomas. Patients diagnosed with glioblastoma typically survive for less than a year. Given this poor prognosis, genotyping offers an effective means of categorising gliomas. The Ki67 proliferation index, a marker of cellular proliferation widely used in clinical practice, has shown potential for predicting tumour classification and prognosis. Magnetic resonance imaging (MRI) plays a vital role in the diagnosis of brain tumours. Extracting glioma-related features from MRI and using them to build a machine learning model offers a viable way to classify and predict the level of Ki67 expression.

Methods
This study retrospectively collected MRI data and postoperative immunohistochemical results from 613 glioma patients from the …… Hospital. We then performed registration and skull stripping on four MRI modalities: T1-weighted (T1), T2-weighted (T2), contrast-enhanced T1-weighted (T1CE), and fluid-attenuated inversion recovery (FLAIR). Segmentation of each modality yielded three distinct tumour regions, from which a comprehensive set of texture, first-order, and shape features was extracted. Feature selection used the least absolute shrinkage and selection operator (LASSO) algorithm, followed by ranking of the retained features by importance. The selected features were then subjected to correlation analysis to finalise the set used for machine learning model development. Eight models were used to classify Ki67 expression: logistic regression (LR), naive Bayes, decision tree, gradient boosting tree, support vector machine (SVM), random forest (RF), XGBoost, and LightGBM.
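The feature-selection steps described in the Methods can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the abstract does not give the LASSO penalty, the ranking procedure, or the correlation cut-off, so `alpha=0.1`, the supplied ranking order, and `max_abs_r=0.9` are assumptions, the helper names are hypothetical, and the data are toy values. In practice a library such as scikit-learn would likely be used instead of hand-written coordinate descent.

```python
# Illustrative sketch only: toy data, hypothetical helper names, assumed thresholds.

def soft_threshold(value, penalty):
    """Shrink value towards zero by penalty; exactly 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, target, alpha, n_sweeps=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    columns: list of feature columns, each a list of n numbers (not all zero).
    """
    n = len(target)
    weights = [0.0] * len(columns)
    residual = list(target)  # y - Xw, with w = 0 at the start
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Correlate feature j with the residual that excludes its own contribution.
            rho = sum(c * (r + c * weights[j]) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, alpha) / scale
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
            weights[j] = new_w
    return weights


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def prune_correlated(columns, ranked_indices, max_abs_r=0.9):
    """Walk features from most to least important; drop any that is highly
    correlated with a feature already kept."""
    kept = []
    for idx in ranked_indices:
        if all(abs(pearson(columns[idx], columns[k])) < max_abs_r for k in kept):
            kept.append(idx)
    return kept


# Toy 1: three mutually orthogonal columns; the target is driven almost
# entirely by the first one, so LASSO should zero out the other two.
ortho = [[1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
target = [2.05, -2.05, 1.95, -1.95]  # = 2*ortho[0] + 0.05*ortho[2]
weights = lasso_coordinate_descent(ortho, target, alpha=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]

# Toy 2: column 1 is nearly a multiple of column 0, so it is pruned.
toy = [[1, 2, 3, 4], [2, 4, 6, 8.5], [1, -1, -1, 1]]
kept = prune_correlated(toy, ranked_indices=[0, 1, 2], max_abs_r=0.9)

print(selected, kept)  # prints [0] [0, 2]
```

The L1 penalty sets small coefficients to exactly zero, which is what makes LASSO usable as a screening step, and the greedy pruning pass keeps the higher-ranked member of each highly correlated pair.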
Results
In total, 613 patients were enrolled, and 24,455 radiomic features were extracted from each patient’s MRI. After LASSO screening, RF importance ranking, and correlation analysis, these were reduced to 36 features. Among the machine learning models tested, LR and linear SVM performed best: LR achieved the highest area under the curve (AUC), 0.912 ± 0.036, and linear SVM achieved the highest accuracy, 0.884 ± 0.031.

Conclusions
This study introduced a novel approach for classifying Ki67 expression levels from MRI, which proved effective in this cohort. With the LR model at its core, the method shows promise as a direction for future research. Predicting Ki67 expression from MRI features may improve our understanding of tumour cell proliferative activity and represents a meaningful advance in glioma research. More broadly, the results underscore the potential of integrating machine learning with medical imaging to aid the diagnosis and prognosis of complex diseases.
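For readers unfamiliar with how figures such as the AUC and accuracy in the Results are typically summarised, the following is a minimal sketch. It assumes the ± values denote the standard deviation across cross-validation folds, which the abstract does not state. The fold data, the 0.5 decision threshold, and the helper names are all hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(l) for l, s in zip(labels, scores))
    return hits / len(labels)


def summarise(values):
    """Format per-fold values as 'mean ± sample standard deviation'."""
    return f"{mean(values):.3f} ± {stdev(values):.3f}"


# Hypothetical held-out predictions from three folds: (labels, predicted scores).
folds = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.2]),
    ([1, 1, 0, 0], [0.8, 0.3, 0.4, 0.1]),
    ([1, 0, 1, 0], [0.7, 0.7, 0.6, 0.2]),
]
aucs = [roc_auc(y, s) for y, s in folds]
accs = [accuracy(y, s) for y, s in folds]

print("AUC:", summarise(aucs))
print("Accuracy:", summarise(accs))
```

AUC is threshold-free while accuracy depends on the chosen cut-off, which is one reason the model with the best AUC need not be the model with the best accuracy.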