Abstract

Despite significant advances in data management systems in recent decades, processing big data at scale remains challenging. While cloud computing is well accepted as a solution to scalability needs, the complexity of cloud configuration and operation persists and often presents an entry barrier, especially for novice data analysts. Serverless computing and Function-as-a-Service (FaaS) platforms have been suggested as a way to reduce such entry barriers by shifting configuration and operational responsibilities from the application developer to the FaaS platform provider. Naturally, "serverless data processing (SDP)", that is, using FaaS for (big) data processing, has received increasing interest in recent years. However, FaaS platforms were never primarily intended to support large data processing tasks. SDP therefore manifests itself through workarounds and adaptations at the application level, addressing various quirks and limitations of the FaaS platforms used for data processing. This, in turn, creates tensions between the platforms and the applications using them, encouraging the constant (re-)design of both. Consequently, we present lessons learned from a series of application and platform re-designs that address these tensions, leading to the development of an SDP reference architecture and a platform instantiation and implementation thereof called CREW. Mitigating these tensions through application-platform co-design proves to reduce both entry barriers and costs significantly. In some experiments, CREW outperforms traditional, non-SDP big data processing frameworks by significant factors.
