Abstract

In this paper, we explore, extend and simplify the localization of the description ability of the well-established MPEG-7 (Scalable Colour Descriptor (SCD), Colour Layout Descriptor (CLD) and Edge Histogram Descriptor (EHD)) and MPEG-7-like (Color and Edge Directivity Descriptor (CEDD)) global descriptors, which we call the SIMPLE family of descriptors. Sixteen novel descriptors are introduced that utilize four different sampling strategies for the extraction of image patches to be used as points of interest. With content-based image retrieval tasks in mind, we investigate, analyse and propose the preferred process for defining the parameters involved (point detection, description, codebook sizes and descriptors’ weighting strategies). The experimental results conducted on four different image collections reveal an astonishing boost in the retrieval performance of the proposed descriptors compared to their performance in their original global form. Furthermore, they manage to outperform common SIFT- and SURF-based approaches, while performing comparably to, if not better than, recent state-of-the-art methods whose success rests on much more complex data manipulation.
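To make the pipeline behind the SIMPLE family more concrete, the following is a minimal Python sketch of the general idea: detect salient points, describe a square patch around each point with a global descriptor, quantise the patch descriptors into a visual-word codebook, and represent each image as a weighted histogram of visual words. It assumes OpenCV and scikit-learn are available; the global_descriptor function is a hypothetical stand-in (a coarse HSV colour histogram) for the actual SCD/CLD/EHD/CEDD descriptors, which are not implemented here, and the SIFT detector represents only one of the four sampling strategies studied in the paper.

import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def global_descriptor(patch):
    # Hypothetical stand-in for an MPEG-7(-like) global descriptor (SCD/CLD/EHD/CEDD).
    # A coarse HSV colour histogram is used purely for illustration.
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [8, 4], [0, 180, 0, 256])
    return cv2.normalize(hist, None).flatten()


def simple_features(image, max_points=100):
    # Detect salient points, cut a square patch around each one, and describe
    # every patch with the global descriptor: the core idea behind SIMPLE.
    detector = cv2.SIFT_create(nfeatures=max_points)
    keypoints = detector.detect(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), None)
    descriptors = []
    for kp in keypoints:
        x, y = int(kp.pt[0]), int(kp.pt[1])
        r = max(int(round(kp.size / 2)), 1)
        patch = image[max(0, y - r):y + r, max(0, x - r):x + r]
        if patch.size == 0:  # skip non-usable patches near the image border
            continue
        descriptors.append(global_descriptor(patch))
    return np.array(descriptors)


def build_codebook(all_descriptors, k=128):
    # Quantise the pooled patch descriptors into k visual words (bag of visual words).
    return MiniBatchKMeans(n_clusters=k).fit(np.vstack(all_descriptors))


def bovw_histogram(descriptors, codebook):
    # Represent one image as a weighted histogram of visual-word occurrences.
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / (hist.sum() or 1.0)  # simple L1 weighting; tf-idf is another option

The codebook size (k) and the weighting applied to the histogram are exactly the parameters whose tuning is investigated in the paper; the values above are placeholders.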

Highlights

  • Extracting a meaningful descriptor from an image is a central problem in a variety of computer vision tasks

  • We evaluate the retrieval performance of the proposed SIMPLE descriptors and discuss the impact of the weighting schemes

  • When detecting patches with the Scale-Invariant Feature Transform (SIFT) detector, and due to the percentage of non-usable patches, only the SIMPLE SIFT-Scalable Colour Descriptor (SCD) variant shows a performance improvement over the baseline


Introduction

Extracting a meaningful descriptor from an image is a central problem in a variety of computer vision tasks. The impact of factors such as the kind of features employed, computational complexity, storage requirements and scalability can vary significantly across computer vision domains. We are interested in exploring the combination of features that best describes an image with respect to its visual properties and its visual content, focusing on content-based image retrieval (CBIR) tasks. When designing descriptors for CBIR, one must take into account the ever-growing data involved in the process. Image collections are growing exponentially in a variety of domains.
