In lung computer-aided detection (Lung CAD) systems, the candidate regions of interest (ROIs) for lung nodules contain many false positives, and the resulting imbalance between positive and negative samples (true-positive and false-positive nodules) makes true-positive nodules more likely to be misclassified. To address this, a cost-sensitive multikernel learning support vector machine (CS-MKL-SVM) algorithm is proposed. Different penalty coefficients are assigned to positive and negative samples so that the model better learns the features of true-positive nodules and classification improves. To further improve both the detection rate of pulmonary nodules and the overall recognition accuracy, a score function named F-new, based on the harmonic mean of accuracy (ACC) and sensitivity (SEN), is proposed as the fitness function for subsequent parameter optimization by particle swarm optimization (PSO), and the feasibility of this function is analyzed. Compared with a fitness function that considers only accuracy or only sensitivity, the new algorithm improves both the detection rate and the recognition accuracy of pulmonary nodules. Compared with grid search, PSO-based parameter search reduces model training time by a factor of nearly 20, enabling rapid parameter optimization. The maximum F-new obtained on the test set by the proposed algorithm is 0.9357; at this maximum, the corresponding ACC is 91% and SEN is 96.3%. Compared with the single radial basis function kernel, the proposed algorithm's F-new is 2.16% higher, ACC is 1.00% higher, and SEN is equal. Compared with the single polynomial kernel, F-new is 3.64% higher, ACC is 1.00% higher, and SEN is 7.41% higher.
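As a quick check on the reported figures, the F-new score can be sketched as a plain harmonic mean of ACC and SEN. This is a minimal illustration that assumes the standard unweighted harmonic mean; the function name `f_new` and the zero-handling convention are assumptions, not taken from the paper.

```python
def f_new(acc: float, sen: float) -> float:
    """Harmonic mean of accuracy and sensitivity, both given as fractions in [0, 1].

    Assumption: the unweighted harmonic mean 2 * ACC * SEN / (ACC + SEN).
    """
    if not (0.0 <= acc <= 1.0 and 0.0 <= sen <= 1.0):
        raise ValueError("acc and sen must lie in [0, 1]")
    if acc + sen == 0.0:
        return 0.0  # convention chosen here to avoid division by zero
    return 2.0 * acc * sen / (acc + sen)


# Reported operating point: ACC = 91%, SEN = 96.3%.
reported = f_new(0.91, 0.963)

# The harmonic mean penalizes an imbalance between the two metrics more
# strongly than the arithmetic mean would (0.75 for this pair).
imbalanced = f_new(1.0, 0.5)

print(round(reported, 3), round(imbalanced, 3))
```

With the rounded inputs 0.91 and 0.963, the result agrees with the reported maximum of 0.9357 to about three decimal places. Because the harmonic mean is dominated by the smaller of its two arguments, a parameter setting that trades away sensitivity for accuracy (or the reverse) scores poorly, which is presumably why it suits a fitness function meant to raise both.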
The experimental results show that the F-new, ACC, and SEN of the proposed algorithm are the best among the single-kernel comparisons, and that combining a multikernel function with the F-new index outperforms a single kernel function. Compared with the grid-search MKL-SVM algorithm, the ACC of the proposed algorithm is 1% lower, and it equals the ACC of the MKL-SVM algorithm based on PSO alone. Compared with these two algorithms, SEN is 3.71% and 7.41% higher, respectively. The cost-sensitive method can therefore effectively reduce missed nodule detections, which further supports the usefulness of the new algorithm.
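The two ingredients of CS-MKL-SVM described above, a multikernel function and class-dependent penalty coefficients, can be illustrated with a small sketch. The abstract does not state how the kernels are combined or how the penalties enter the objective, so everything here is an assumption: a convex combination of a radial basis function (RBF) kernel and a polynomial kernel, and a hinge-loss penalty term weighted per class. All function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def poly_kernel(x: Sequence[float], z: Sequence[float],
                degree: int, coef0: float = 1.0) -> float:
    """Polynomial kernel (<x, z> + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def mixed_kernel(x: Sequence[float], z: Sequence[float], weight: float,
                 gamma: float, degree: int, coef0: float = 1.0) -> float:
    """Assumed multikernel: weight * RBF + (1 - weight) * polynomial."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, z, gamma)
            + (1.0 - weight) * poly_kernel(x, z, degree, coef0))


def cost_sensitive_hinge(labels: Sequence[int], scores: Sequence[float],
                         c_pos: float, c_neg: float) -> float:
    """Penalty term of a soft-margin SVM with per-class coefficients.

    labels are +1 (true nodule) or -1 (false positive); scores are the
    decision values f(x). Slack on positives costs c_pos, on negatives c_neg.
    """
    total = 0.0
    for y, s in zip(labels, scores):
        slack = max(0.0, 1.0 - y * s)
        total += (c_pos if y == 1 else c_neg) * slack
    return total


# A missed nodule (positive sample with a negative score) costs more than an
# equally wrong negative when c_pos > c_neg.
print(cost_sensitive_hinge([1], [-1.0], c_pos=4.0, c_neg=1.0))
print(cost_sensitive_hinge([-1], [1.0], c_pos=4.0, c_neg=1.0))
```

In a setup like this, the kernel weight, the kernel parameters, and the two penalty coefficients would be the quantities a PSO search tunes against the F-new fitness; raising `c_pos` relative to `c_neg` pushes the decision boundary toward fewer missed nodules at the cost of more false positives.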