The performance of supervised deep learning image classifiers has improved significantly with large labeled datasets and increased computing power. However, obtaining large labeled image datasets in areas such as medicine is expensive. This study seeks to improve model performance on limited labeled datasets by reducing confusion between classes. We observed that misclassification (or confusion) is usually concentrated in specific pairs of classes rather than spread evenly across all classes. We therefore developed the synthesized image training technique (SIT2), a novel confusion-based training framework that identifies pairs of classes with high confusion and synthesizes not-sure images from these pairs. The not-sure images are used in three new training strategies: (1) the not-sure strategy pretrains a model on the not-sure images together with the original training images, (2) the sure-or-not strategy pretrains on synthesized images labeled as sure or not-sure, and (3) the multi-label strategy pretrains on synthesized images but predicts the original class(es) of each synthesized image. In all three strategies, the pretrained model is then fine-tuned on the original dataset. An extensive evaluation was conducted on five medical and nonmedical datasets. Several of the improvements are statistically significant, which suggests that the confusion-based training framework is promising.
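The step of identifying class pairs with high confusion can be illustrated with a minimal sketch. It assumes that confusion is measured from a validation confusion matrix and that errors in both directions (class i predicted as j, and j predicted as i) are summed for each pair; the abstract does not specify either choice, and the function names `confusion_matrix` and `top_confused_pairs` are hypothetical. The synthesis of not-sure images is left out because the abstract does not describe how it is done.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix where cm[i, j] = number of samples of class i predicted as j."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def top_confused_pairs(cm, k=3):
    """Return up to k unordered class pairs (i, j, errors), most confused first.

    Assumption: confusion for a pair is the sum of errors in both
    directions, cm[i, j] + cm[j, i]. Pairs with zero errors are dropped.
    """
    sym = cm + cm.T
    rows, cols = np.triu_indices_from(sym, k=1)  # each pair i < j once, no diagonal
    counts = sym[rows, cols]
    order = np.argsort(-counts, kind="stable")[:k]
    return [
        (int(rows[i]), int(cols[i]), int(counts[i]))
        for i in order
        if counts[i] > 0
    ]
```

Under these assumptions, the pairs returned by `top_confused_pairs` would be the candidates from which not-sure images are synthesized; a one-directional or rate-normalized measure of confusion would be an equally plausible alternative.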