Abstract

Image retrieval aims to return the images in a database that are most similar to a query image. However, the performance of image retrieval models is often hindered by the two-dimensional nature of images, which lack depth information about objects. To address this issue, we propose a novel image retrieval model called FFLDGA-Net (Feature Fusion-based Learnable Descriptor Graph Attention Network). This model compensates for the absence of depth information in images by fusing feature information from image data and point cloud data. First, we introduce the LDGA-Net, which effectively improves the model's ability to mine hard samples and negative samples. Then, we combine a multi-scale route convolution module with a one-dimensional path aggregation network to fuse point cloud and image features at multiple scales, and to establish relationships between high-dimensional and low-dimensional features. To mitigate training noise, we incorporate a soft label strategy tailored to the dataset's characteristics. Experimental results on two benchmark datasets demonstrate that FFLDGA-Net achieves significant improvements in image retrieval performance.
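The soft label strategy mentioned above can be illustrated with a generic label-smoothing sketch. This is a common technique for mitigating training noise; the paper's exact formulation (tailored to the dataset's characteristics) may differ, and the function name and `epsilon` value here are illustrative assumptions.

```python
def soften_labels(one_hot, epsilon=0.1):
    """Convert a one-hot label vector into a soft label distribution.

    A standard label-smoothing scheme: the true class keeps 1 - epsilon
    of the probability mass, and each of the K - 1 wrong classes receives
    epsilon / (K - 1), so the distribution still sums to 1. This is only
    a sketch of the general idea, not the paper's specific strategy.
    """
    k = len(one_hot)
    return [(1.0 - epsilon) if v == 1 else epsilon / (k - 1) for v in one_hot]

# Example: a 4-class one-hot label with the true class in position 1.
soft = soften_labels([0, 1, 0, 0], epsilon=0.1)
# The true class gets 0.9; the remaining 0.1 is spread over the other classes.
```

Softening the hard targets in this way prevents the model from becoming overconfident on noisy labels, which is the usual motivation for such strategies.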
