Background: Lung cancer remains the leading cause of cancer-related death worldwide, and early detection substantially improves patient survival. Standard diagnostic methods have limited sensitivity, especially in the early stages. In this paper, we examine radiomic features that can improve diagnostic accuracy in automated lung cancer detection, focusing on the main feature categories (texture, shape, and intensity) extracted from CT DICOM images. Methods: We developed and compared two machine learning models, a DenseNet-201 CNN and XGBoost, trained on radiomic features to distinguish malignant from benign tumors. Feature importance was analyzed using SHAP and permutation importance, which support both global and case-specific interpretability of the models. Results: The most informative features, which reflect tumor heterogeneity and morphology, included GLCM entropy, shape compactness, and surface-area-to-volume ratio. Both models performed well in diagnosis: DenseNet-201 achieved an accuracy of 92.4% and XGBoost 89.7%. The interpretability analysis supports the potential of these features for early detection and for increasing diagnostic confidence. Conclusions: This work identifies the most important radiomic features and quantifies their diagnostic significance through a feature selection process that incorporates stability analysis, providing a blueprint for feature-driven model interpretability in clinical applications. Radiomic features have considerable value in the automated diagnosis of lung cancer, especially when combined with machine learning models, and may improve early detection and enable personalized diagnostic strategies in precision oncology.
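As an illustration of the texture category, GLCM entropy can be sketched in a few lines. This is a minimal sketch, not the authors' extraction pipeline: the abstract does not state which software, gray-level quantization, or pixel offset was used, so the function name `glcm_entropy`, the `levels` and `offset` parameters, and the min-max quantization are all assumptions made here for illustration. The function builds a symmetric gray-level co-occurrence matrix for a 2D region and returns its Shannon entropy in bits.

```python
import math
from collections import Counter


def glcm_entropy(image, levels=8, offset=(0, 1)):
    """Entropy (in bits) of a symmetric, normalized gray-level co-occurrence matrix.

    `image` is a 2D list of intensities. `levels` and `offset` are illustrative
    assumptions; the abstract does not specify the settings used in the study.
    """
    rows, cols = len(image), len(image[0])
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo

    def quantize(v):
        # Min-max quantization into `levels` bins (assumed scheme).
        if span == 0:
            return 0
        return min(int(levels * (v - lo) / span), levels - 1)

    q = [[quantize(v) for v in row] for row in image]

    dr, dc = offset
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                a, b = q[r][c], q[rr][cc]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count both directions so the matrix is symmetric

    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Shannon entropy over the non-zero co-occurrence probabilities.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    uniform = [[5, 5, 5], [5, 5, 5]]
    checker = [[0, 1], [1, 0]]
    print(glcm_entropy(uniform))            # homogeneous region: no texture
    print(glcm_entropy(checker, levels=2))  # alternating pattern: higher entropy
```

A homogeneous region concentrates all co-occurrences in one matrix cell and gives zero entropy, while heterogeneous tissue spreads them across many cells and gives a higher value, which is why this feature is read as a marker of tumor heterogeneity. In practice a validated radiomics library applied to the segmented CT volume would be preferable to a hand-rolled routine like this one.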