The growing use of medical imaging technology with digital storage capabilities has facilitated the compilation of large-scale data repositories. Fast access to image samples that are similar in appearance to suspected cases can help establish a consulting system for healthcare professionals and improve diagnostic procedures while minimizing processing delays. However, manual querying of large repositories is labor-intensive. Content-based image retrieval (CBIR) offers an automated solution based on quantitative assessment of image similarity in a latent feature space. Since conventional methods based on hand-crafted features typically show poor generalization performance, learning-based CBIR methods have recently received attention. A common framework in this domain involves classifier-guided models trained to detect different image classes; similarity assessments are then performed on features captured at intermediate stages of the trained models. While classifier-guided methods are powerful at inter-class discrimination, they are suboptimally sensitive to within-class differences in image features. An alternative framework instead performs task-agnostic training to learn an embedding space that enforces the representational discriminability of images. Within this representation-learning framework, a powerful approach is triplet-wise learning, which addresses the deficiencies of point-wise and pair-wise learning in characterizing similarity relationships between image classes. However, the traditional triplet loss enforces separation between only a subset of image pairs within the triplet via a manually set, constant margin value, so it can lead to suboptimal segregation of opponent classes and limited generalization performance.
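The traditional triplet loss referred to above can be sketched as follows. This is a minimal illustration of the standard constant-margin formulation, not the paper's OCAM loss; the function name and the margin value of 0.2 are illustrative choices, not taken from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss with a constant, manually set margin.

    Pushes the anchor-positive distance to be smaller than the
    anchor-negative distance by at least `margin`. Note that only the
    two anchor-centered pairs are constrained; the positive-negative
    pair is ignored, which is one of the limitations of the
    traditional formulation noted in the abstract.
    """
    d_ap = np.sum((anchor - positive) ** 2)  # squared anchor-positive distance
    d_an = np.sum((anchor - negative) ** 2)  # squared anchor-negative distance
    return max(d_ap - d_an + margin, 0.0)   # hinge: zero loss once separated by margin
```

For example, a triplet whose negative is already farther from the anchor than the positive by more than the margin incurs zero loss, so no further separation of those opponent classes is enforced.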
To address these limitations, we introduce a triplet-learning method for automated querying of medical image repositories based on a novel Opponent Class Adaptive Margin (OCAM) loss. To maintain optimally discriminative representations, OCAM considers relationships among all image pairs within the triplet and uses an adaptive margin value that is automatically selected per dataset and over the course of training. The CBIR performance of OCAM is compared against state-of-the-art loss functions for representational learning on three public databases (gastrointestinal disease, skin lesion, lung disease). On average, OCAM attains an mAP of 86.30% on the KVASIR dataset, 70.30% on the ISIC 2019 dataset, and 85.57% on the X-RAY dataset. Comprehensive experiments in each application domain demonstrate the superior performance of OCAM, which outperforms competing triplet-wise methods by 1.52%, classifier-guided methods by 2.29%, and non-triplet representation-learning methods by 4.56%.