Abstract
Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained from natural image dataset to medical image tasks. In this paper, we exploit three important, but previously understudied factors of employing deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters, and vary in numbers of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve the state-of-the-art performance on the mediastinal LN detection, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high performance CAD systems for other medical imaging tasks.
Highlights
T REMENDOUS progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets (i.e., ImageNet [1], [2]) and the recent revival of deep convolutional neural networks (CNN) [3], [4]
The CNN models trained upon this database serve as the backbone for significantly improving many object detection and image segmentation problems using other datasets [6], [7], e.g., PASCAL [8] and medical image categorization [9]–[12]
We explore and evaluate different CNN architectures varying in width and depth, describe the effects of varying dataset scale and spatial image context on performance, and discuss when and why transfer learning from pre-trained ImageNet CNN models can be valuable
Summary
T REMENDOUS progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets (i.e., ImageNet [1], [2]) and the recent revival of deep convolutional neural networks (CNN) [3], [4]. Unlike previous image datasets used in computer vision, ImageNet [1] offers a very comprehensive database of more than 1.2 million categorized natural images of 1000+ classes. The CNN models trained upon this database serve as the backbone for significantly improving many object detection and image segmentation problems using other datasets [6], [7], e.g., PASCAL [8] and medical image categorization [9]–[12]. There exists no large-scale annotated medical image dataset comparable to ImageNet, as data acquisition is difficult, and quality annotation is costly. A decompositional 2.5D view resampling and an aggregation of random view classification scores are used to eliminate the “curse-of-dimensionality” issue in [22], in order to acquire a sufficient number of training image samples
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.