Abstract

This article presents a media storm indexing mechanism, where media storms are defined as fast incoming batches of images. We propose an approximate media storm indexing mechanism to index and store massive image collections that arrive at a varying rate. To evaluate the proposed mechanism, two architectures are used: i) a baseline architecture, which uses a disk-based processing strategy, and ii) an in-memory architecture, which uses the Flink distributed stream processing framework. This study is the first in the literature to use an in-memory processing strategy for media storm indexing. In the experimental evaluation, conducted on two image datasets that are among the largest publicly available, with 80 M and 1 B images, a media storm generator is implemented to evaluate the proposed mechanism on different indexing workloads, that is, images that arrive with high volume and different velocity, at the scale of $10^5$ and $10^6$ incoming images per second. With the approximate media storm indexing mechanism, an average speedup factor of 26.32 is achieved compared with conventional indexing techniques, while high search accuracy is maintained after the media storms have been indexed. Finally, the implementations of both architectures and of the media storm indexing mechanisms are made publicly available.

