Abstract

With the rapid growth of multimedia content on the Internet, especially images, building Content-Based Image Retrieval (CBIR) systems for large-scale image collections has become a major challenge. One drawback of CBIR is its very long execution time. In this article, we propose a fast Content-Based Image Retrieval system using Spark (CBIR-S) that targets large-scale image collections. Our system is composed of two steps: (i) an image indexation step, in which we use the MapReduce distributed model on Spark to speed up the indexation process, together with a memory-centric distributed storage system called Tachyon to speed up write operations; and (ii) an image retrieval step, which we speed up by using a parallel k-Nearest Neighbors (k-NN) search method based on the MapReduce model and implemented on Apache Spark, in addition to exploiting the caching mechanism of the Spark framework. We show, through a wide set of experiments, the effectiveness of our approach in terms of processing time.
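
The retrieval step can be illustrated with a minimal sketch of a MapReduce-style k-NN search. This is not the CBIR-S implementation: it is plain Python rather than Spark, and the function names, the Euclidean distance, the two-dimensional feature vectors, and the image identifiers are all assumptions made for illustration. The map step finds the k nearest images inside each partition of the index, and the reduce step merges those partial results into a global top k.

```python
import heapq
import math
from functools import reduce


def local_top_k(partition, query, k):
    """Map step: the k nearest (distance, image_id) pairs within one partition."""
    scored = (
        (math.dist(features, query), image_id)
        for image_id, features in partition
    )
    return heapq.nsmallest(k, scored)


def merge_top_k(left, right, k):
    """Reduce step: merge two partial results and keep the k smallest."""
    return heapq.nsmallest(k, left + right)


def knn_search(partitions, query, k):
    # On a cluster each partition would be processed by a separate worker;
    # here the map step runs sequentially for simplicity.
    partials = [local_top_k(p, query, k) for p in partitions]
    return reduce(lambda a, b: merge_top_k(a, b, k), partials, [])


# Hypothetical index: two partitions of (image_id, feature_vector) pairs.
partitions = [
    [("img_a", (0.0, 0.0)), ("img_b", (1.0, 1.0))],
    [("img_c", (0.2, 0.1)), ("img_d", (5.0, 5.0))],
]
result = knn_search(partitions, query=(0.0, 0.0), k=2)
```

Because each partition returns at most k candidates, the reduce step handles only a small amount of data regardless of index size. In a Spark setting, keeping the indexed feature vectors cached in memory would presumably let repeated queries skip reloading the index, which is the role the abstract assigns to the framework's caching mechanism.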
