Abstract

The real-time analysis of Big Data streams is a terrific resource for transforming data into value. Big Data technologies for smart processing of massive data streams are available for this purpose, but the facilities they offer are often too raw to be effectively exploited by analysts. RAM3S (Real-time Analysis of Massive MultiMedia Streams) is a framework that acts as a middleware software layer between multimedia stream analysis techniques and Big Data streaming platforms, so as to facilitate the implementation of the former on top of the latter. RAM3S has proven helpful in simplifying the deployment of non-parallel techniques to streaming platforms, such as Apache Storm or Apache Flink. In this paper, we show how RAM3S has been updated to incorporate novel stream processing platforms, such as Apache Samza, and to communicate with different message brokers, such as Apache Kafka. Abstracting from the message broker also makes it possible to pipeline several RAM3S instances, each of which can perform a different processing task. This is a richer model for stream analysis than the one available in the original RAM3S version. The generality of this new RAM3S version is demonstrated through experiments conducted on three different multimedia applications, proving that RAM3S is a formidable asset for enabling efficient and effective Data Mining and Machine Learning on multimedia data streams.

Highlights

  • Multimedia (MM) data have been used in a plethora of applications for at least three decades

  • We describe the metamorphosis of RAM3S [2,3], a framework that we developed to integrate Big Data management platforms, so as to allow researchers to implement real-time complex analyses of massive MM streams exploiting a distributed computing environment

  • The first step towards the definition of a framework for the analysis of massive multimedia streams came from the realization that, in all the use cases we considered, incoming data streams were processed as illustrated in Figure 2


Summary

Introduction

Multimedia (MM) data have been used in a plethora of applications for at least three decades. The new version of RAM3S extends the original framework in the following ways:

  • The RAM3S interface to input (and output) streams has been generalized to encompass different message brokers; the current version includes interfaces to three message brokers, as opposed to the single one originally available

  • Decoupling RAM3S from the message broker allows cascading different RAM3S instances, since the output of one instance can be used as input to another. This provides a recursive, and richer, model for the processing of MM streams

  • As an additional feature, RAM3S now includes another popular Big Data streaming platform, Apache Samza, in addition to the ones already present (Spark Streaming, Storm, and Flink)

  • Thanks to the range of alternatives provided by this new version of RAM3S, we are able to experimentally compare the performance of different message brokers and stream processing engines on three real-world applications, demonstrating the usefulness of RAM3S as a platform for testing purposes

We believe that this new version of RAM3S represents a formidable asset for enabling efficient and effective Data Mining and Machine Learning on MM data streams.
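The idea of decoupling processing from the message broker, and of cascading instances through it, can be illustrated with a minimal Python sketch. This is not the RAM3S API: the names `MessageBroker`, `InMemoryBroker`, `Instance`, and `run_pipeline_demo`, the topic names, and the toy analyzers are all hypothetical and exist only for illustration. The in-memory queue is an assumed stand-in for a real broker.

```python
from abc import ABC, abstractmethod
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Optional


class MessageBroker(ABC):
    """Hypothetical broker abstraction (an assumption, not the RAM3S interface)."""

    @abstractmethod
    def publish(self, topic: str, message: object) -> None:
        ...

    @abstractmethod
    def consume(self, topic: str) -> Optional[object]:
        ...


class InMemoryBroker(MessageBroker):
    """Toy stand-in for a real broker: one FIFO queue per topic."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[object]] = defaultdict(deque)

    def publish(self, topic: str, message: object) -> None:
        self._queues[topic].append(message)

    def consume(self, topic: str) -> Optional[object]:
        queue = self._queues[topic]
        return queue.popleft() if queue else None


class Instance:
    """One processing stage: read a topic, apply an analyzer, write a topic."""

    def __init__(self, broker: MessageBroker, in_topic: str, out_topic: str,
                 analyze: Callable[[object], object]) -> None:
        self.broker = broker
        self.in_topic = in_topic
        self.out_topic = out_topic
        self.analyze = analyze

    def run(self) -> int:
        """Drain the input topic and return the number of messages processed."""
        processed = 0
        while (message := self.broker.consume(self.in_topic)) is not None:
            self.broker.publish(self.out_topic, self.analyze(message))
            processed += 1
        return processed


def run_pipeline_demo() -> List[object]:
    """Cascade two instances: the output topic of the first feeds the second."""
    broker = InMemoryBroker()
    detect = Instance(broker, "frames", "faces", lambda f: f"face-in-{f}")
    recognize = Instance(broker, "faces", "identities", lambda s: str(s).upper())

    for frame in ["f1", "f2"]:
        broker.publish("frames", frame)

    detect.run()
    recognize.run()

    results: List[object] = []
    while (item := broker.consume("identities")) is not None:
        results.append(item)
    return results
```

Because each stage depends only on the abstract `MessageBroker`, swapping the in-memory queue for an adapter around a real broker would leave the stages unchanged, and any stage's output topic can serve as another stage's input.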

Running Example
Paper Outline
In the Beginning
The Message Broker
Communication Models
RabbitMQ
ActiveMQ
Let There Be RAM3S
Introducing Samza
Samza APIs
Samza and Kafka
Generalizing the Message Broker
Adding Samza
Experimental Results
Message Brokers Comparison
Use Cases
Face Recognition
Printed Text Recognition in Videos
Plate Recognition
Conclusions