Raising the Parallel Abstraction Level for Streaming Analytics Applications

Gabriele Mencagli, Massimo Torquati, Luiz Gustavo L Fernandes, Dalvan Griebler, Marco Danelutto

doi:10.1109/access.2019.2941183

Abstract

In the stream processing domain, applications are represented as graphs of arbitrarily connected operators, each filled with its business logic code. The APIs of existing Stream Processing Systems (SPSs) ease the development of transformations that recur in streaming practice (e.g., filtering, aggregation, and joins). In contrast, their parallelism abstractions are quite limited: they support only stateless operators, or stateful operators whose state is organized as a set of key-value pairs. This paper presents how the parallel patterns methodology can be revisited for sliding-window streaming analytics. Our vision fosters a design process in which the application is built by composing and nesting ready-to-use patterns provided through a C++17 fluent interface. Our prototype implements the run-time system of the patterns in the FastFlow parallel library, which expresses thread-based parallelism. The experimental analysis shows interesting outcomes. First, our pattern-based approach allows easy prototyping of different versions of the application, and the programmer can leverage nesting of patterns to increase performance (up to 37% in one of the two considered test-bed cases). Second, our FastFlow implementation is three times faster than handmade ports of our patterns to popular JVM-based SPSs. Finally, in the concluding part of this paper, we explore the use of a task-based run-time system, deriving insights into how to make our patterns library suitable for multiple back ends.

Highlights

The data deluge generated by our ever-more-connected world raises the need for easy-to-use frameworks able to efficiently process data streams in real time.
Such frameworks should provide high-level, user-friendly programming interfaces that ease the development of efficient streaming applications. They should enable efficient execution on modern hardware, not only on clusters, as in traditional systems like Apache Storm [1] and Apache Flink [2], but also on modern powerful scale-up servers equipped with tens of cores and terabytes of memory.
In our prior work [5], we proposed a set of parallel patterns targeting continuous analytics based on sliding windows.

Summary

INTRODUCTION

The data deluge generated by our ever-more-connected world raises the need for easy-to-use frameworks able to efficiently process data streams in real time. Such frameworks should provide high-level, user-friendly programming interfaces that ease the development of efficient streaming applications. They should enable efficient execution on modern hardware, not only on clusters, as in traditional systems like Apache Storm [1] and Apache Flink [2], but also on modern powerful scale-up servers equipped with tens of cores and terabytes of memory. In our prior work [5], we proposed a set of parallel patterns targeting continuous analytics based on sliding windows. Queries of this kind are supported in the existing frameworks and represent an essential part of many streaming benchmarks [6].
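To make the notion of a sliding-window query concrete, the following is a minimal sequential sketch in C++17 of a count-based sliding window that sums the items of each window. It is not the WindFlow or FastFlow API: the function name `sliding_window_sums` and the parameters `win_len` (window length in items) and `slide` (number of items between consecutive windows) are hypothetical names chosen for illustration, and the sketch involves no parallelism.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper (an assumption for illustration, not a library call):
// returns the sum of every complete count-based window of `win_len` items,
// where consecutive windows start `slide` items apart.
std::vector<long> sliding_window_sums(const std::vector<long>& stream,
                                      std::size_t win_len,
                                      std::size_t slide) {
    std::vector<long> results;
    if (win_len == 0 || slide == 0) {
        return results;  // degenerate parameters: no window is defined
    }
    std::deque<long> window;  // items of the window currently being filled
    long sum = 0;             // running sum of the buffered items
    std::size_t to_skip = 0;  // items to ignore when slide > win_len
    for (long item : stream) {
        if (to_skip > 0) {
            --to_skip;
            continue;
        }
        window.push_back(item);
        sum += item;
        if (window.size() == win_len) {
            results.push_back(sum);  // the window is complete: emit its result
            // Evict the oldest items so that the next window starts
            // `slide` positions after the current one.
            const std::size_t evict = std::min(slide, win_len);
            for (std::size_t i = 0; i < evict; ++i) {
                sum -= window.front();
                window.pop_front();
            }
            to_skip = slide - evict;
        }
    }
    return results;
}
```

For example, on the stream 1, 2, 3, 4, 5, 6 with `win_len = 3` and `slide = 1`, the windows are (1, 2, 3), (2, 3, 4), (3, 4, 5), and (4, 5, 6), so the function returns 6, 9, 12, 15. Keeping a running sum avoids rescanning the whole window at every slide, which is one common way to evaluate such queries incrementally.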

BACKGROUND

WINDFLOW PARALLEL PATTERNS

NESTING OF PATTERNS AND TRANSFORMATIONS

EXPERIMENTS

FIRST CASE STUDY

SECOND CASE STUDY

Findings

CONCLUSION AND FUTURE WORK


Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 12	License type: CC BY 4.0
