Abstract

In this study, the performance of popular convolutional architectures on an imbalanced dataset is analyzed in detail for a multi-class medical image classification application. A large-scale, imbalanced dataset of dermoscopic images, consisting of 10,015 color lesion images belonging to 7 different skin diseases, was used as a benchmark. Images without pathological testing were labeled by specialist dermatologists who are members of the International Skin Imaging Association. The F1-score was used as the evaluation metric during the training phase of the convolutional networks, which were trained on the imbalanced dataset, and the area under the receiver operating characteristic curve and the confusion matrix of each model were computed in the test phase. In the validation phase of the convolutional networks, the k-fold cross-validation technique was used. In addition, filters learned on the ImageNet dataset were imported via transfer learning. Fine-tuning was applied to the deepest convolutional layers so that these pre-trained models could adapt to our specific application. To prevent overfitting, dropout at a rate of 50% was applied to the flattened feature-extraction outputs of the models, and L2 regularization (weight decay) was applied during the weight updates. Although it is not the main purpose of the study, in order to partially improve the performance of the convolutional architectures, synthetic lesion images created with data augmentation for the minority classes of the imbalanced dataset were included in the training process in a way that does not cause information leakage.
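The abstract does not name a framework or backbone, so the following is only a minimal sketch of the transfer-learning setup it describes (ImageNet weights, fine-tuning of the deepest convolutional layers, 50% dropout after flattening, and L2 weight decay on the classification head). The DenseNet121 backbone, the 224x224 input size, the number of unfrozen layers, and the L2 strength are all illustrative assumptions, not details taken from the paper.

```python
from tensorflow.keras import Model, layers, regularizers
from tensorflow.keras.applications import DenseNet121

# Illustrative backbone and input size (not specified in the abstract).
base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))

# Freeze the pre-trained ImageNet filters, then unfreeze only the deepest
# convolutional layers for fine-tuning (the 12-layer cut-off is an assumption).
for layer in base.layers:
    layer.trainable = False
for layer in base.layers[-12:]:
    layer.trainable = True

x = layers.Flatten()(base.output)
x = layers.Dropout(0.5)(x)  # 50% dropout after flattening, as described
outputs = layers.Dense(
    7,                                         # 7 skin-disease classes
    activation="softmax",
    kernel_regularizer=regularizers.l2(1e-4),  # L2 weight decay (assumed strength)
)(x)

model = Model(base.input, outputs)
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    # The study monitors the F1-score during training; in recent Keras versions
    # this could be tf.keras.metrics.F1Score(average="macro"), otherwise a
    # custom metric or callback is needed.
    metrics=["accuracy"],
)
model.summary()
```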
