Abstract

With recent advances in technology, the number of digital image archives has grown exponentially. In Content-Based Image Retrieval (CBIR), high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we present a novel visual words integration of the Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). These two local feature representations are selected for image retrieval because SIFT is more robust to changes in scale and rotation, while SURF is robust to changes in illumination. The visual words integration of SIFT and SURF therefore combines the robustness of both features for image retrieval. Qualitative and quantitative comparisons conducted on the Corel-1000, Corel-1500, Corel-2000, Oliva and Torralba, and Ground Truth image benchmarks demonstrate the effectiveness of the proposed visual words integration.
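The integration described above can be sketched as a bag-of-visual-words pipeline: descriptors from each detector are quantized against their own visual vocabulary, and the two word histograms are combined into a single image signature. The sketch below is a minimal illustration only: it uses random vectors in place of real SIFT/SURF descriptors, assumes pre-learned vocabularies (e.g. from k-means), and combines the histograms by concatenation, which may differ from the paper's exact integration scheme.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Quantize each descriptor to its nearest visual word and
    return an L1-normalized histogram of word frequencies."""
    # Squared Euclidean distance from every descriptor to every word
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)

# Stand-ins for real data: SIFT descriptors are 128-D, SURF descriptors 64-D.
sift_desc = rng.random((200, 128))
surf_desc = rng.random((150, 64))

# Hypothetical pre-clustered vocabularies of 100 visual words each.
sift_vocab = rng.random((100, 128))
surf_vocab = rng.random((100, 64))

# Integrated signature: concatenation of the two per-detector histograms.
signature = np.concatenate([bovw_histogram(sift_desc, sift_vocab),
                            bovw_histogram(surf_desc, surf_vocab)])
print(signature.shape)  # → (200,)
```

Because each histogram is normalized separately, both detectors contribute equally to the final signature regardless of how many keypoints each one fires on.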

Highlights

  • Content-Based Image Retrieval (CBIR) provides a potential solution to the challenges posed when retrieving images that are similar to the query image [1, 2]

  • Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) are reported as two robust local features [22], and both have been evaluated on different image datasets [22,23,24]

  • We show that by integrating the visual words of SIFT and SURF, more precise, effective, and reliable image retrieval results can be obtained


Introduction

CBIR provides a potential solution to the challenges posed when retrieving images that are similar to the query image [1, 2]. Occlusion, overlapping objects, spatial layout, image resolution, variations in illumination, the semantic gap, and the exponential growth of multimedia content make CBIR a challenging research problem [1,2,3]. In CBIR, an image is represented as a feature vector that consists of low-level image features [2]. The closeness of the feature vector of a query image to those of the images placed in an archive determines the output [4]. Texture and shape are examples of global low-level features that can describe the content-based attributes of an image [2]. Color features do not represent spatial distribution; the closeness of the color values of two images belonging to different classes results in the output of irrelevant images [1, 2]. Spatial texture techniques are sensitive to noise and distortion, while spectral texture techniques work effectively on square regions by using the Fast Fourier Transform.
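The retrieval step described above, ranking archive images by the closeness of their feature vectors to the query's, can be sketched as a simple nearest-neighbour search. The sketch below uses Euclidean distance and a dictionary-based archive layout; both are illustrative assumptions, not the paper's specific retrieval configuration.

```python
import numpy as np

def retrieve(query_vec, archive, top_k=3):
    """Rank archive images by Euclidean distance of their feature
    vectors to the query vector; return the top_k closest image ids.
    `archive` maps image ids to feature vectors (an assumed layout)."""
    ids = list(archive)
    feats = np.stack([archive[i] for i in ids])
    dists = np.linalg.norm(feats - query_vec, axis=1)
    order = np.argsort(dists)[:top_k]  # smallest distance first
    return [ids[i] for i in order]

rng = np.random.default_rng(1)
archive = {f"img_{i}": rng.random(16) for i in range(10)}

# A query that is a slightly perturbed copy of img_4's feature vector,
# so img_4 should be ranked first.
query = archive["img_4"] + 0.01 * rng.random(16)
print(retrieve(query, archive))
```

In practice the feature vectors would be the visual-word histograms from the proposed pipeline, and an approximate index (e.g. a k-d tree) would replace the brute-force distance computation for large archives.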

