Abstract
In the past few years, deep learning-based medical image analysis has significantly improved computer-assisted tasks such as detecting, diagnosing, and predicting medical outcomes. The monitoring and diagnosis of ailments such as cancer and COVID-19 rely primarily on retrieving medical images. As medical databases grow rapidly, managing and querying them becomes increasingly difficult. The main challenge lies in achieving high retrieval performance while handling the intricacies of medical imaging datasets. To tackle this, we propose a novel end-to-end trainable framework for medical image retrieval using vision transformer hashing with supervised contrastive (VTHSC) learning. By leveraging the self-attention mechanism of transformers together with a supervised contrastive loss, our model surpasses existing hashing-based retrieval methods. Focusing on recent COVID-19 and breast cancer datasets, we aim to efficiently retrieve relevant medical images from large collections. The VTHSC model uses an ImageNet-pretrained ViT as its backbone network, adds a hashing head, and is trained by jointly optimizing the hashing loss and a supervised contrastive loss. The model is then fine-tuned for the retrieval task under four different hashing frameworks: Deep Supervised Hashing (DSH), GreedyHash, Improved Deep Hashing Network (IDHN), and Deep Polarized Network (DPN). Our model achieves Mean Average Precision (MAP) scores of 98.9 on the BreakHis dataset and 96.03 on the COVID dataset. Notably, the VTHSC model outperforms several competing hashing-based retrieval methods by a substantial margin on both benchmark medical image datasets, across metrics such as precision, recall, and the quality of the Top-6 retrieved images.
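To make the retrieval and evaluation steps concrete, the following is a minimal sketch of how binary hash codes can be ranked by Hamming distance and scored with Mean Average Precision. It is not the authors' implementation: the function names, the sign-thresholding step, the toy codes, and the convention of normalizing average precision by the number of relevant items retrieved are all assumptions made for illustration, and the abstract does not specify the exact evaluation protocol.

```python
from typing import List, Optional, Sequence


def sign_binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued hashing-head outputs to bits (assumed: positive -> 1, else 0)."""
    return [1 if x > 0 else 0 for x in features]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_database(query_code: Sequence[int],
                  db_codes: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.

    Python's sort is stable, so ties keep their original database order.
    """
    return sorted(range(len(db_codes)),
                  key=lambda i: hamming_distance(query_code, db_codes[i]))


def average_precision(query_label, ranked_labels: Sequence,
                      top_k: Optional[int] = None) -> float:
    """Average precision over the (optionally truncated) ranked list.

    Assumed convention: normalize by the number of relevant items retrieved.
    """
    if top_k is not None:
        ranked_labels = ranked_labels[:top_k]
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_codes, query_labels, db_codes, db_labels,
                           top_k: Optional[int] = None) -> float:
    """Mean of per-query average precision under Hamming ranking."""
    ap_values = []
    for code, label in zip(query_codes, query_labels):
        order = rank_database(code, db_codes)
        ranked = [db_labels[i] for i in order]
        ap_values.append(average_precision(label, ranked, top_k))
    return sum(ap_values) / len(ap_values)
```

In this sketch, passing `top_k=6` restricts scoring to the six nearest database images, which is one plausible way to examine a Top-6 retrieval setting; the scores reported in the abstract come from the authors' own pipeline, not from this code.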