Abstract

Image retrieval is one of the supporting technologies for (near) real-time photogrammetry and for loop closure detection in visual SLAM. The conventional retrieval strategy first extracts features from the query image and the database images, then returns the database images whose features are nearest to those of the query. However, image retrieval methods based on traditional hand-crafted features (SIFT, SURF, GIST) struggle to guarantee both time efficiency and precision in practical applications. Learning-based features, by contrast, have shown superior performance in many computer vision tasks. This paper therefore investigates several popular learning-based global features (ResNet101, VGG16+NetVLAD, Yolov3+VGG16+NetVLAD) and a local feature (SuperPoint). To balance time efficiency and precision, we present hierarchical image retrieval solutions that combine the two kinds of features: the global feature accelerates the search, and the local feature provides precision. Specifically, three hierarchical retrieval solutions are designed from different combinations of learning-based global and local features. Their precision and time efficiency are compared on several public benchmarks (one of which contains more than 10,000 images). The experimental results show that, among the proposed solutions, VGG16+NetVLAD+SuperPoint is the most efficient, but its precision is slightly lower than that of the solution preprocessed with Yolov3.
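
The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each image already has one global descriptor (for example from NetVLAD) and a set of local descriptors (for example from SuperPoint), and it assumes cosine similarity for the global stage and a mutual-nearest-neighbour match count for the local stage. The function names and the dictionary layout are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest(desc, pool):
    """Index of the descriptor in `pool` most similar to `desc`."""
    return max(range(len(pool)), key=lambda j: cosine(desc, pool[j]))


def mutual_matches(locals_a, locals_b):
    """Count mutual-nearest-neighbour matches between two local descriptor sets.

    Assumption: the matching rule is illustrative, not taken from the paper.
    """
    if not locals_a or not locals_b:
        return 0
    count = 0
    for i, d in enumerate(locals_a):
        j = nearest(d, locals_b)
        if nearest(locals_b[j], locals_a) == i:
            count += 1
    return count


def hierarchical_retrieve(query, database, top_k=2):
    """Two-stage retrieval over precomputed descriptors.

    query:    {"global": [...], "local": [[...], ...]}
    database: {image_id: same structure as query}
    Returns the shortlisted image ids, re-ranked by local match count.
    """
    # Stage 1 (coarse): rank every database image by global similarity
    # and keep only the top_k candidates.
    shortlist = sorted(
        database,
        key=lambda k: cosine(query["global"], database[k]["global"]),
        reverse=True,
    )[:top_k]
    # Stage 2 (fine): run the costlier local matching on the shortlist only.
    return sorted(
        shortlist,
        key=lambda k: mutual_matches(query["local"], database[k]["local"]),
        reverse=True,
    )


# Toy descriptors (hypothetical values, two dimensions for readability).
database = {
    "a": {"global": [1.0, 0.0], "local": [[1.0, 0.0], [0.0, 1.0]]},
    "b": {"global": [0.9, 0.1], "local": [[1.0, 0.0]]},
    "c": {"global": [0.0, 1.0], "local": [[0.0, 1.0]]},
}
query = {"global": [0.9, 0.1], "local": [[1.0, 0.1], [0.1, 1.0]]}
print(hierarchical_retrieve(query, database, top_k=2))  # ['a', 'b']
```

In this toy case the global stage ranks image "b" first, while the local stage promotes image "a" because two of its local descriptors match the query instead of one. The expensive local matching runs only on the `top_k` shortlist, which is the source of the time saving the hierarchical design aims for.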
