Lung cancer is one of the most common forms of cancer resulting in over a million deaths per year worldwide. In this paper, the usage of support vector machine (SVM) classification for lung cancer is investigated, presenting a systematic quantitative evaluation against Boosting, Decision trees, k-nearest neighbor, LASSO regressions, neural networks and random forests. A large database of 5984 regions of interest (ROIs) and 488 input features (including textural features, patient characteristics, and morphological features) were used to train the classifiers and evaluate for their performance. The evaluation for classifiers’ performance was based on a tenfold cross validation framework, receiver operating characteristic curve (ROC), and Matthews correlation coefficient. Area under curve (AUC) of SVM, Boosting, Decision trees, k-nearest neighbor, LASSO, neural networks, random forests were 0.94, 0.86, 0.73, 0.72, 0.91, 0.92, and 0.85, respectively. It was proved that SVM classification offered significantly increased classification performance compared to the reference methods. This scheme may be used as an auxiliary tool to differentiate between benign and malignant SPNs of CT images in future
Read full abstract