Abstract

Various deep convolutional neural networks (CNNs) have been used to distinguish between benign and malignant pulmonary nodules in CT images. However, a single learner often performs unsatisfactorily because its hypothesis space is limited or wrongly chosen, or because training falls into a local minimum. To tackle these issues, we propose building ensemble learners by fusing multiple deep CNN learners for pulmonary nodule classification. CT image patches of 743 nodules are extracted from the LIDC-IDRI database. First, eight deep CNN learners with different architectures are trained and evaluated by 10-fold cross-validation, so that each nodule has eight predictions, one from each primary learner. Second, we fuse these eight predictions by majority voting (VOT), averaging (AVE), or machine learning. Specifically, eight machine learning algorithms are implemented: K-Nearest-Neighbor (KNN), Support Vector Machines (SVM), Naive Bayes (NB), Decision Trees (DT), Multi-layer Perceptron (MLP), Random Forests (RF), Gradient Boosting Regression Trees (GBRT) and Adaptive Boosting (AdaBoost). Moreover, the correlation coefficients between the predictions of the 10 ensemble learners are calculated, and a hierarchical clustering dendrogram is drawn. The ensemble learners achieve higher prediction accuracy than a single CNN learner (84.0% vs 81.7%). The overlap ratio among the 10 ensemble learners is much higher than that among the 8 primary learners (62.9% vs 33.2%). In addition, the ensemble learners fall roughly into three categories: the first (SVM, MLP, GBRT and RF) achieves the best performance; the second (VOT and AVE) is better than the third (AdaBoost, DT, NB and KNN). VOT and AVE yield higher recall than the machine learning algorithms. These results indicate that ensemble learners based on multiple CNN learners can achieve better performance for pulmonary nodule classification using CT images, and that the preferred fusion strategies include SVM, MLP, GBRT and RF.
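
The two non-learned fusion strategies named in the abstract, VOT and AVE, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold, the tie-breaking rule (ties count as malignant) and the example probabilities are all assumptions made for illustration. A learned fusion strategy such as SVM or MLP would instead train a classifier on the eight-prediction vector of each nodule.

```python
from typing import List


def majority_vote(probs: List[float], threshold: float = 0.5) -> int:
    """VOT: each primary learner casts one 0/1 vote.

    Assumption: a tie (possible with eight learners) counts as malignant (1);
    the source does not say how ties are resolved.
    """
    votes = sum(1 for p in probs if p >= threshold)
    return 1 if votes * 2 >= len(probs) else 0


def average_fusion(probs: List[float], threshold: float = 0.5) -> int:
    """AVE: average the predicted probabilities, then threshold once."""
    return 1 if sum(probs) / len(probs) >= threshold else 0


# Hypothetical malignancy probabilities from eight primary learners
# for a single nodule (illustrative values, not data from the paper).
nodule = [0.9, 0.8, 0.45, 0.4, 0.45, 0.6, 0.4, 0.45]

vot_label = majority_vote(nodule)   # only 3 of 8 learners vote malignant
ave_label = average_fusion(nodule)  # mean probability is about 0.556
```

On this made-up nodule the two strategies disagree: a few confident learners pull the average above the threshold even though most individual votes are benign.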

Highlights

  • Lung cancer is the most common malignancy and the leading cause of cancer deaths worldwide [1]

  • Consistent with previous studies, our results indicate that the ensemble learners perform better than a single convolutional neural network (CNN) model and significantly reduce the variance of the primary learners

  • In the hierarchical clustering dendrogram of the predictions yielded by the 10 ensemble methods, Support Vector Machines (SVM) and Multi-layer Perceptron (MLP) fall in one clade, while Random Forests (RF), Gradient Boosting Regression Trees (GBRT), SVM and MLP together form a larger clade
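
The clade structure described in the last highlight comes from comparing the prediction vectors of the ensemble methods. A minimal sketch of that idea follows, assuming Pearson correlation as the similarity and 1 - r as the distance; the source does not state the distance or linkage actually used. The function names and toy predictions are hypothetical. The sketch finds only the first merge an agglomerative clustering would make, that is, the most correlated pair.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length vectors.

    Constant vectors (zero variance) are not handled in this sketch.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def closest_pair(preds: Dict[str, List[float]]) -> Tuple[str, str]:
    """Return the two learners with the smallest distance 1 - r.

    This pair is the first one merged by agglomerative clustering.
    """
    names = sorted(preds)
    pairs = [
        (1.0 - pearson(preds[a], preds[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    _, a, b = min(pairs)
    return a, b


# Hypothetical 0/1 predictions of four fusion methods on six nodules
# (illustrative values, not data from the paper).
toy = {
    "SVM": [1, 1, 0, 0, 1, 0],
    "MLP": [1, 1, 0, 0, 1, 0],
    "KNN": [1, 0, 0, 1, 1, 0],
    "VOT": [1, 1, 1, 0, 1, 0],
}

first_merge = closest_pair(toy)
```

Repeating the merge step on the remaining groups, with a chosen linkage rule, would build the full dendrogram; a library routine for hierarchical clustering would normally be used for that.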


Summary

Introduction

Lung cancer is the most common malignancy and the leading cause of cancer deaths worldwide [1]. It is often diagnosed at an advanced stage, and the five-year survival rate is only 17.8%, which is much lower than that of other leading cancers [2]. With CT screening, early detection of lung cancer in the form of pulmonary nodules can increase the five-year survival rate significantly (up to 55%) [3], [4]. The automatic classification of early detected pulmonary nodules into benign and malignant categories is essential for clinical decision-making [5], [6]. Nodules with a high likelihood of malignancy are recommended for biopsy or surgical resection, and those with a low likelihood for CT surveillance [7].

B. Zhang et al.: Ensemble Learners of Multiple Deep CNNs for Pulmonary Nodules Classification Using CT Images

