Composable and efficient functional big data processing framework

Dongyao Wu, Sherif Sakr, Qinghua Lu, Liming Zhu

doi:10.1109/bigdata.2015.7363765

Abstract

Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are coarsely defined and packaged as executable jars, without any of their functionality being exposed or described. This means that deployed jobs are not natively composable or reusable for subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations can improve job completion time by between 10% and 60% for different types of operation sequences when compared with the current state of the art, Apache Spark.
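To make the idea of composable jobs over a functional data dependency graph more concrete, here is a minimal sketch in Python. It is not HDM's API and does not reproduce the paper's optimizations: the names `Node`, `source`, `fuse_maps`, `run`, `depth`, `clean`, and `non_empty` are all hypothetical, and map fusion is used only as one familiar example of an optimization that becomes possible once a job exposes its functions instead of hiding them in a packaged jar.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Node:
    """One step in a lazily built data-flow graph (hypothetical, not HDM's API)."""
    op: str                            # "source", "map", or "filter"
    fn: Optional[Callable] = None      # function applied by this step
    parent: Optional["Node"] = None    # upstream dependency
    data: Optional[tuple] = None       # only set for "source" nodes

    def map(self, fn):
        return Node("map", fn, self)

    def filter(self, pred):
        return Node("filter", pred, self)


def source(items):
    """Wrap an in-memory collection as the root of a graph."""
    return Node("source", data=tuple(items))


def depth(node):
    """Number of steps in the chain ending at `node`."""
    return 1 if node.parent is None else 1 + depth(node.parent)


def fuse_maps(node):
    """Rewrite the graph so that consecutive map steps become a single map."""
    if node.parent is None:
        return node
    parent = fuse_maps(node.parent)
    if node.op == "map" and parent.op == "map":
        f, g = parent.fn, node.fn
        return Node("map", lambda x, f=f, g=g: g(f(x)), parent.parent)
    return Node(node.op, node.fn, parent)


def run(node):
    """Evaluate the graph eagerly on a single machine."""
    if node.op == "source":
        return list(node.data)
    upstream = run(node.parent)
    if node.op == "map":
        return [node.fn(x) for x in upstream]
    return [x for x in upstream if node.fn(x)]


# Reusable job fragments: each takes a graph and returns an extended graph.
def clean(job):
    return job.map(str.strip).map(str.lower)


def non_empty(job):
    return job.filter(lambda s: s != "")


pipeline = non_empty(clean(source(["  Spark ", " ", "HDM"])))
optimized = fuse_maps(pipeline)
```

Because `clean` and `non_empty` are ordinary functions over graph nodes, they can be reused in other pipelines, and `fuse_maps` can rewrite the combined graph before `run` executes it. In this sketch the rewritten graph has one step fewer and yields the same result as the original.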

Similar Papers

HDM: A Composable Framework for Big Data Processing
Dongyao Wu ... Qinghua Lu
IEEE Transactions on Big Data | VOL. 4
01 Jun 2018

Explore Big Data Analytics Applications and Opportunities: A Review
Zaher Ali Al-Sai ... Rasha Moh’D Sadeq Abdin
Big Data and Cognitive Computing | VOL. 6
14 Dec 2022

Chapter 7 - Public Transportation Big Data Mining and Analysis
Xiaolei Ma ... Xi Chen
Data-Driven Solutions to Transportation Problems | VOL. -
07 Dec 2018

The Deep Learning and Apache Spark Enabled Architecture for Improving the Performance of Big Data Classification
Anilkumar V Brahmane ... Dr B Chaitanya Krishna
International Journal of Innovative Technology and Exploring Engineering | VOL. 8
30 Sep 2019
