A Model and Survey of Distributed Data-Intensive Systems

Alessandro Margara, Stefano Cilloni, Nicolò Felicioni, Gianpaolo Cugola

doi:10.1145/3604801

Abstract

Data is a precious resource in today’s society, and it is generated at an unprecedented and constantly growing pace. The need to store, analyze, and make data promptly available to a multitude of users introduces formidable challenges in modern software platforms. These challenges have radically impacted the research fields that gravitate around data management and processing, with the introduction of distributed data-intensive systems that offer innovative programming models and implementation strategies to handle data characteristics such as its volume, the rate at which it is produced, its heterogeneity, and its distribution. Each data-intensive system brings its specific choices in terms of data model, usage assumptions, synchronization, processing strategy, deployment, and guarantees in terms of consistency, fault tolerance, and ordering. Yet, the problems data-intensive systems face and the solutions they propose frequently overlap. This article proposes a unifying model that dissects the core functionalities of data-intensive systems, and discusses alternative design and implementation strategies, pointing out their assumptions and implications. The model offers a common ground to understand and compare highly heterogeneous solutions, with the potential of fostering cross-fertilization across research communities. We apply our model by classifying tens of systems: an exercise that leads to interesting observations on the current trends in the domain of data-intensive systems and suggests open research directions.

Journal: ACM Computing Surveys
Publication Date: Aug 26, 2023
Citations: 1

Similar Papers

Performance analysis of data intensive cloud systems based on data management and replication: a survey
Saif Ur Rehman Malik ... Joanna Kolodziej
Distributed and Parallel Databases | VOL. 34
14 Mar 2015

A Generic and Extensible Core and Prototype of Consistent, Distributed, and Resilient LIS
Zdravko Galić ... Mario Vuzem
ISPRS International Journal of Geo-Information | VOL. 9
13 Jul 2020

Architectural Approaches to Overcome Challenges in The Development of Data-Intensive Systems
Aleksandar Dimov ... Tasos Papapostolu
01 Jan 2021

Leveraging human-centered design and causal pathway diagramming toward enhanced specification and development of innovative implementation strategies: a case example of an outreach tool to address racial inequities in breast cancer screening
Leah M Marcotte ... Gary Hsieh
Implementation Science Communications | VOL. 5
28 Mar 2024
