Abstract
Manual identification of brain cancer types is error-prone and can delay diagnosis and treatment planning. This study presents an approach to predicting brain cancer type that combines machine learning (ML) algorithms with feature selection and dimensionality reduction. A multi-class classification framework was developed and evaluated with six ML models: Bernoulli Naive Bayes, K-nearest neighbors classifier, decision tree classifier, Gaussian process classifier (GPC), passive aggressive classifier, and perceptron. To improve model performance, three feature-reduction methods were applied: the Gini index, mutual information, and principal component analysis (PCA). A case study was conducted to assess the predictive accuracy of these models. The GPC, when trained and validated on features derived via PCA, outperformed the other models in predictive accuracy and generalization. Specifically, the four dimensions identified by PCA (d1, d2, d3, and d4) were the most effective in distinguishing between brain cancer types. Compared with the baseline GPC model trained on all original features, the PCA-enhanced GPC achieved relative increases in accuracy, precision, recall, and F1 score of 294.31%, 22.14%, 294.31%, and 878.18%, respectively. These findings indicate that combining ML algorithms with targeted feature selection can improve the accuracy of brain cancer type prediction, with potential benefits for clinical decision-making and patient outcomes.
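The reported gains are easiest to read as relative changes against the baseline, and the identical 294.31% figures for accuracy and recall are what one would expect if recall is averaged over classes with support weights, since support-weighted recall equals accuracy in multi-class classification. The abstract does not state the averaging scheme, so this is an assumption. The following minimal sketch illustrates both points in plain Python. The labels, predictions, and function names are hypothetical toy values chosen for illustration and are not taken from the study.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall, and F1 (assumed averaging scheme)."""
    support = Counter(y_true)
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0  # no predictions -> 0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1

def relative_increase(baseline, improved):
    """Percentage change of `improved` relative to `baseline`."""
    return (improved - baseline) / baseline * 100.0

# Hypothetical toy data: four classes, three samples each.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
y_base = [0] * 12                               # baseline collapses to one class
y_pca = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 2]    # one misclassification

acc_base, acc_pca = accuracy(y_true, y_base), accuracy(y_true, y_pca)
_, rec_base, _ = weighted_scores(y_true, y_base)
_, rec_pca, _ = weighted_scores(y_true, y_pca)

print("accuracy gain (%):", relative_increase(acc_base, acc_pca))
print("recall gain (%):  ", relative_increase(rec_base, rec_pca))
```

On any input the two printed gains agree up to floating-point rounding, because the support weights cancel the per-class denominators in recall and leave the overall fraction of correct predictions.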