Abstract
Stream processing applications extract value from raw data through Directed Acyclic Graphs of data analysis tasks. Shared-nothing (SN) parallelism is the de-facto standard to scale stream processing applications. Given an application, SN parallelism instantiates several copies of each analysis task, making each instance responsible for a dedicated portion of the overall analysis, and relies on dedicated queues to exchange data among connected instances. On the one hand, SN parallelism can scale the execution of applications both up and out, since threads can run task instances within and across processes/nodes. On the other hand, its lack of sharing can cause unnecessary overheads and hinder the scale-up when threads operate on data that could be jointly accessed in shared memory. This trade-off motivated us to study a way for stream processing applications to leverage shared memory and boost the scale-up (before the scale-out) while adhering to the widely adopted SN-based APIs for stream processing applications. We introduce STRETCH, a framework that maximizes the scale-up and offers instantaneous elastic reconfigurations (without state transfer) for stream processing applications. We propose the concept of Virtual Shared-Nothing (VSN) parallelism and elasticity and provide formal definitions and correctness proofs for the semantics of the analysis tasks supported by STRETCH, showing that they extend those found in common Stream Processing Engines. We also provide a fully implemented prototype and show that STRETCH's performance exceeds that of state-of-the-art frameworks such as Apache Flink and offers, to the best of our knowledge, unprecedented ultra-fast reconfigurations, taking less than 40 ms even when provisioning tens of new task instances.