Introduction: The remarkable growth of lung cancer and its associated impacts and consequences, along with the substantial costs it imposes on society, has driven the medical community to pursue programs aimed at further examination, prevention, early detection, and diagnosis. In medical science, timely discovery and diagnosis of diseases can prevent many life-threatening conditions and save people's lives.
Material and Methods: This study aims to predict lung cancer using a novel feature selection method integrated with a classifier. Our approach is a four-stage method. First, we calculate feature similarities within a lung cancer dataset using the absolute value of the Pearson correlation coefficient. Second, we cluster the initial features using the Louvain community detection algorithm. Third, we determine the optimal subset of features using the concept of node centrality. Finally, lung cancer diagnosis is performed by a classifier trained on the selected features.
Results: Comparative analysis shows that the proposed method outperforms existing techniques in execution time, number of selected features, and prediction accuracy. The method reduced 12600 features to 118 and achieved accuracies of 95.28, 95.49, 95.23, and 95.32 with the Support Vector Machine (SVM), Decision Tree (DT), Naive Bayes (NB), and K-Nearest Neighbor (KNN) classifiers, respectively. In the runtime comparison, the proposed method ran in 2.146 seconds, significantly faster than the other methods.
Conclusion: The proposed feature selection method successfully reduced the initial feature set and significantly decreased computational time. Moreover, the achieved prediction accuracies underscore the reliability of our approach.
This substantial reduction in feature space, achieved while maintaining consistently high prediction accuracy, supports the effectiveness and practical applicability of our methodology in the domain of lung cancer prediction. These results suggest that the approach has potential for real-world impact.
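
The first three stages described under Material and Methods (similarity, clustering, centrality-based selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `select_features`, the similarity cutoff `threshold`, the `per_cluster` count, and the use of weighted degree as the centrality measure are all assumptions made for illustration, since the abstract does not specify them. The sketch relies on NumPy and NetworkX (version 2.8 or later for `louvain_communities`).

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities


def select_features(X, threshold=0.5, per_cluster=1, seed=0):
    """Return sorted column indices of the selected features.

    threshold, per_cluster and seed are illustrative parameters,
    not values taken from the study.
    """
    # Stage 1: feature similarity = |Pearson correlation| between columns.
    # Constant columns give NaN correlations, which are treated as 0 here.
    sim = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    np.fill_diagonal(sim, 0.0)

    # Build a weighted graph: one node per feature, one edge per
    # feature pair whose similarity exceeds the cutoff.
    n_features = X.shape[1]
    graph = nx.Graph()
    graph.add_nodes_from(range(n_features))
    rows, cols = np.where(np.triu(sim, k=1) > threshold)
    graph.add_weighted_edges_from(
        (int(i), int(j), float(sim[i, j])) for i, j in zip(rows, cols)
    )

    # Stage 2: cluster the features with Louvain community detection.
    communities = louvain_communities(graph, weight="weight", seed=seed)

    # Stage 3: rank nodes by weighted degree (an assumed centrality measure)
    # and keep the most central feature(s) of each community.
    strength = dict(graph.degree(weight="weight"))
    selected = []
    for community in communities:
        ranked = sorted(community, key=lambda node: (-strength[node], node))
        selected.extend(ranked[:per_cluster])
    return sorted(selected)
```

For the fourth stage, the reduced matrix `X[:, select_features(X)]` would be passed to any standard classifier, such as the SVM, DT, NB, or KNN models named in the Results. Keeping one representative per community is only one possible selection rule; the study may use a different centrality measure or keep several features per cluster.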