Abstract

Interest in Big Data Stream processing is growing. As Internet-based computing systems become more capable of generating, storing and processing Big Data Streams, various applications benefit from information extracted from data streams in real time. Such applications include IoT-based monitoring systems, decision-making systems, recommendation systems and security systems. Their common ingredient is the use of real-time data analytics. The foremost challenge in processing Big Data Streams is throughput: the computing system must sustain a throughput high enough to accommodate the rate at which the input data stream is generated. In addition, processing must achieve streaming consistency, which requires that the order of events in the original (input) stream is preserved in the output. In this paper we show that using a heterogeneous cluster for Big Data Stream processing can lead to streaming inconsistency, and that the system must be carefully tuned to achieve streaming consistency. We exemplify the approach using Yahoo! S4 to process the Big Data Stream from the FlightRadar24 global flight monitoring system.
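
As an illustration of the consistency requirement stated above, the following minimal sketch checks whether an output stream preserves the order of events of the input stream. It is not taken from the paper or from Yahoo! S4: the function names `is_order_preserved` and `count_out_of_order` are hypothetical, and the sketch assumes that every event carries a unique identifier and that the output contains only events present in the input.

```python
from typing import Dict, Hashable, Sequence


def _positions(input_events: Sequence[Hashable]) -> Dict[Hashable, int]:
    # Map each (assumed unique) event identifier to its index in the input stream.
    return {event: index for index, event in enumerate(input_events)}


def is_order_preserved(input_events: Sequence[Hashable],
                       output_events: Sequence[Hashable]) -> bool:
    """Return True if output events appear in the same relative order as in the input."""
    position = _positions(input_events)
    last_seen = -1
    for event in output_events:
        current = position[event]  # raises KeyError for an event not in the input
        if current <= last_seen:
            return False  # this event was emitted after a later (or the same) input event
        last_seen = current
    return True


def count_out_of_order(input_events: Sequence[Hashable],
                       output_events: Sequence[Hashable]) -> int:
    """Count output events emitted after an event that followed them in the input."""
    position = _positions(input_events)
    highest_seen = -1
    late = 0
    for event in output_events:
        current = position[event]
        if current < highest_seen:
            late += 1
        else:
            highest_seen = current
    return late


if __name__ == "__main__":
    source = ["e1", "e2", "e3", "e4"]
    # Hypothetical output in which a slower node delays event e2.
    observed = ["e1", "e3", "e2", "e4"]
    print(is_order_preserved(source, observed))
    print(count_out_of_order(source, observed))
```

A check of this kind could be run on the input and output of a processing cluster to detect whether nodes of different speeds have reordered events; the count gives a rough measure of how severe the inconsistency is.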
