Cost-efficient dynamic scheduling of big data applications in Apache Spark on cloud

Muhammed Tawfiqul Islam, Satish Narayana Srirama, Shanika Karunasekera, Rajkumar Buyya

doi:10.1016/j.jss.2019.110515

Abstract

Job scheduling is one of the most crucial components in managing resources and efficiently executing big data applications. Scheduling jobs in a cloud-deployed cluster is particularly challenging because the cloud offers different types of Virtual Machines (VMs) and jobs can be heterogeneous. The default schedulers of big data processing frameworks fail to reduce the cost of VM usage in the cloud while satisfying the performance constraints of each job. Existing work on cluster scheduling mainly focuses on improving job performance and does not exploit the variety of VM types in the cloud to reduce cost. In this paper, we propose efficient scheduling algorithms that reduce the cost of resource usage in a cloud-deployed Apache Spark cluster. The proposed algorithms can also prioritise jobs based on their given deadlines, and they are online and adaptive to cluster changes. We have implemented the proposed algorithms on top of Apache Mesos. We have also performed extensive experiments on real datasets and compared our algorithms with existing schedulers to demonstrate their advantages. The results indicate that our algorithms can reduce resource usage cost by up to 34% under different workloads while also improving job performance.

Journal: Journal of Systems and Software
Publication Date: Dec 27, 2019
Citations: 37

Similar Papers

Job scheduling for big data analytical applications in clouds: A taxonomy study
Youyou Kang ... Shijun Liu
Future Generation Computer Systems | VOL. 135
06 May 2022

Towards Performance Modeling as a Service by Exploiting Resource Diversity in the Public Cloud
Mark Meredith ... Bhuvan Urgaonkar
01 Jun 2016

AN ENHANCED TASK ALLOCATION STRATEGY IN CLOUD ENVIRONMENT
Kavita Redishettywar ... Prof Rafik Juber Thekiya
INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY | VOL. 16
18 Aug 2017

A NOVEL APPROACH OF OPTIMIZING PERFORMANCE USING K-MEANS CLUSTERING IN CLOUD COMPUTING
Sheenam Kamboj ... Mr Navtej Singh Ghumman
INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY | VOL. 15
18 Dec 2016
