ArcaDB: A Disaggregated Query Engine for Heterogenous Computational Environments.

Kristalys Ruiz-Rohena, Manuel Rodriguez-Martínez

doi:10.1109/cloud62652.2024.00015

Abstract

Modern enterprises rely on data management systems to collect, store, and analyze vast amounts of data related to their operations. Clusters and hardware accelerators (e.g., GPUs, TPUs) have become a necessity for scaling with the data processing demands of many applications in social media, bioinformatics, surveillance systems, remote sensing, and medical informatics. Given this new scenario, the architecture of data analytics engines must evolve to take advantage of these technological trends. In this paper, we present ArcaDB: a disaggregated query engine that leverages container technology to place operators at compute nodes that fit their performance profile. In ArcaDB, a query plan is dispatched to worker nodes that have different computing characteristics. Each operator is annotated with the preferred type of compute node for execution, and ArcaDB ensures that the operator gets picked up by the appropriate workers. We have implemented a prototype version of ArcaDB using Java, Python, and Docker containers. We have also completed a preliminary performance study of this prototype, using images and scientific data. This study shows that ArcaDB can speed up query performance by a factor of 3.5 in comparison with a shared-nothing, symmetric arrangement.
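
The operator placement idea described in the abstract can be illustrated with a small sketch. This is not ArcaDB's implementation: the `Operator` and `Dispatcher` classes, the `preferred_node` field, and the node type labels ("cpu", "gpu") are all hypothetical names chosen for illustration. The sketch assumes one in-memory FIFO queue per node type and ignores data dependencies between operators, containers, and networking.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A query plan operator annotated with a preferred node type (hypothetical)."""
    name: str
    preferred_node: str  # assumed labels such as "cpu" or "gpu"


class Dispatcher:
    """Keeps one FIFO queue of pending operators per node type (illustrative only)."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def submit(self, plan):
        # Route each operator to the queue matching its annotation.
        for op in plan:
            self._queues[op.preferred_node].append(op)

    def pick_up(self, worker_type):
        # A worker only sees operators annotated for its own node type.
        queue = self._queues.get(worker_type)
        return queue.popleft() if queue else None


# A toy plan: the operator names are made up for this example.
plan = [
    Operator("scan_images", "cpu"),
    Operator("classify_images", "gpu"),
    Operator("aggregate", "cpu"),
]
dispatcher = Dispatcher()
dispatcher.submit(plan)
```

In this sketch a worker that identifies itself as "gpu" would receive only `classify_images`, while "cpu" workers would receive `scan_images` and then `aggregate`. A real engine would also need to track operator dependencies and move intermediate results between nodes.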

More From: Proceedings. IEEE International Conference on Cloud Computing

Similar Papers

Data Science in Healthcare: Implications for Early Career Investigators.
Sanjeev P Bhavnani ... Daniel Muñoz
Circulation: Cardiovascular Quality and Outcomes | VOL. 9
01 Nov 2016

A Value of Data Science in the Medical Informatics: An Overview
Nguyen Thi Dieu Linh ... Zhongyu Lu
01 Jan 2020

Updating professional competencies in health informatics: A scoping review and consultation with subject matter experts
Helen Monkman ... Andre W Kushniruk
International Journal of Medical Informatics | VOL. 170
20 Dec 2022

The Design of Soft Base Station Based on Docker
Kunheng Wu ... Xianghuang Chen
01 Dec 2018
