Abstract

Hadoop is a well-known parallel computing system for distributed computing and large-scale data processing. “Straggling” tasks, however, have a serious impact on task allocation and scheduling in a Hadoop system. Speculative Execution (SE) is an efficient method of handling “straggling” tasks: it monitors the real-time running status of tasks and selectively backs up “stragglers” on another node to increase the chance of completing the entire job early. Present speculative execution strategies face two challenges, misjudgement of “straggling” tasks and improper selection of backup nodes, which lead to inefficient speculative execution. This paper proposes an Optimized Resource Scheduling strategy for Speculative Execution (ORSE) that introduces a non-cooperative game scheme. ORSE transforms the resource scheduling of backup tasks into a multi-party non-cooperative game problem, in which the tasks are regarded as game participants and the total task execution time of the entire cluster serves as the utility function. The most beneficial strategy for each computing node is reached when the game arrives at a Nash equilibrium point, which gives the final resource scheduling scheme. The strategy has been implemented in Hadoop-2.x. Experimental results show that ORSE can maintain the efficiency of speculative execution and improve fault tolerance and computation performance under Normal Load, Busy Load, and Busy Load with Skewed Data.
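The monitoring step that speculative execution relies on can be illustrated with a minimal sketch. It is not the paper's method and not Hadoop's actual implementation: the `TaskStatus` fields, the `slow_factor` threshold, and the median-based rule are all assumptions chosen for illustration. A task is flagged as a “straggler” when its estimated remaining time is much larger than that of its peers.

```python
import math
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class TaskStatus:
    """Hypothetical snapshot of one running task (names are illustrative)."""
    task_id: str
    progress: float  # fraction of work completed, between 0.0 and 1.0
    elapsed: float   # seconds since the task started


def time_left(task: TaskStatus) -> float:
    """Estimate remaining seconds from the average progress rate so far."""
    if task.progress >= 1.0:
        return 0.0
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return math.inf  # no progress information yet
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def find_stragglers(tasks: List[TaskStatus], slow_factor: float = 1.5) -> List[str]:
    """Return ids of tasks whose estimated remaining time exceeds
    slow_factor times the median remaining time (an assumed rule)."""
    if not tasks:
        return []
    remaining = {t.task_id: time_left(t) for t in tasks}
    threshold = slow_factor * median(remaining.values())
    return [tid for tid, left in remaining.items() if left > threshold]
```

A fixed threshold of this kind is exactly where misjudgement can creep in: a task that is slow only because of skewed input may be flagged even though a backup copy would not finish sooner.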

Highlights

  • In recent years, from the pace of Internet information technology to the booming trend of e-commerce, Internet data has been expanding rapidly, and big data storage and processing platforms have emerged as the times require [Zafar, Khan, Malik et al (2017); Hashem, Yaqoob, Anuar et al (2015); Lee (2013)]

    CMC. doi:10.32604/cmc.2020.04604 www.techscience.com/journal/cmc

  • Building on a full study of speculative execution, we propose an Optimized Resource Scheduling model for Speculative Execution (ORSE) based on non-cooperative game theory

  • In the Optimized Resource Scheduling strategy for Speculative Execution (ORSE) algorithm, the resource scheduling model for backup tasks is transformed into a classic multi-party non-cooperative game problem: the game participants are the backup task group, the game strategies are the nodes in the cluster, and the game’s utility function is the cluster’s overall task execution time. When the game reaches a Nash equilibrium, the task scheduling scheme is obtained
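The equilibrium condition in the last highlight can be written out with the standard definition of a Nash equilibrium. The notation below is assumed for illustration and is not taken from the paper: $N$ is the set of backup tasks, $M$ the set of cluster nodes, $s_i \in M$ the node chosen by task $i$, $s_{-i}$ the choices of all other tasks, and $T(s)$ the cluster's overall task execution time under the assignment $s$, treated here as a cost to be minimized.

```latex
% Assumed notation: N = backup tasks (players), M = nodes (strategies),
% T(s) = overall execution time of the cluster under assignment s.
\[
  s^{*} = (s_1^{*}, \dots, s_n^{*}) \text{ is a Nash equilibrium}
  \iff
  T\!\left(s_i^{*}, s_{-i}^{*}\right) \le T\!\left(s_i, s_{-i}^{*}\right)
  \quad \text{for all } i \in N \text{ and all } s_i \in M .
\]
```

In words, no single backup task can shorten the overall execution time by moving to a different node while every other task stays where it is.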

Summary

Introduction

From the pace of Internet information technology to the booming trend of e-commerce, Internet data has been expanding rapidly.

With the continuous improvement and development of the Hadoop platform, applications based on HDFS and MapReduce are becoming more and more abundant, such as HBase [Apache hive (2018)] and Hive [Bhupathiraju and Ravuri (2015)], which aim at improving the performance of the cluster and allow people to store and process data more [...]. These applications are based on the Hadoop distributed storage framework “HDFS” [Chang, Dean, Ghemawat et al (2008)] and the computing framework “MapReduce” [Dean and Ghemawat (2008)].

Many famous IT companies, such as Microsoft, Yahoo!, Google, and Amazon, have launched their own big data storage and computing platforms, such as Storm [Toshniwal, Taneja, Shukla et al (2014)], Spark [Zaharia, Chowdhury, Franklin et al (2010)], and Dryad [Isard, Budiu, Birrell et al (2007)], and regard the development of big data platform optimization technology as the core development trend of the future [Storey and Song (2017)].

In the ORSE algorithm, the resource scheduling model for backup tasks is transformed into a classic multi-party non-cooperative game problem: the game participants are the backup task group, the game strategies are the nodes in the cluster, and the game’s utility function is the cluster’s overall task execution time. When the game reaches a Nash equilibrium, the task scheduling scheme is obtained.
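The game described above can be sketched with a simple best-response iteration. This is a toy illustration under stated assumptions, not the ORSE algorithm itself: the cost model (each node's finish time is its load divided by its speed, and the cluster time is the slowest node's finish time), the function names `cluster_time`, `best_response_schedule`, and `is_nash`, and the starting assignment are all hypothetical.

```python
from typing import List

EPS = 1e-12  # tolerance so that only strict improvements count


def cluster_time(assign: List[int], work: List[float],
                 speed: List[float], base: List[float]) -> float:
    """Overall execution time: finish time of the slowest node.
    assign[i] is the node chosen by backup task i (assumed cost model)."""
    load = list(base)  # work already queued on each node
    for task, node in enumerate(assign):
        load[node] += work[task]
    return max(l / s for l, s in zip(load, speed))


def best_response_schedule(work: List[float], speed: List[float],
                           base: List[float], max_rounds: int = 100) -> List[int]:
    """Let each task in turn switch to the node that most reduces the
    cluster time, until no task can improve by moving alone."""
    assign = [0] * len(work)  # arbitrary start: every task on node 0
    for _ in range(max_rounds):
        changed = False
        for task in range(len(work)):
            best_node = assign[task]
            best_cost = cluster_time(assign, work, speed, base)
            for node in range(len(speed)):
                trial = assign[:task] + [node] + assign[task + 1:]
                cost = cluster_time(trial, work, speed, base)
                if cost < best_cost - EPS:
                    best_node, best_cost = node, cost
            if best_node != assign[task]:
                assign[task] = best_node
                changed = True
        if not changed:
            break  # no unilateral improvement left
    return assign


def is_nash(assign: List[int], work: List[float],
            speed: List[float], base: List[float]) -> bool:
    """True if no single task can lower the cluster time by moving."""
    current = cluster_time(assign, work, speed, base)
    for task in range(len(assign)):
        for node in range(len(speed)):
            trial = assign[:task] + [node] + assign[task + 1:]
            if cluster_time(trial, work, speed, base) < current - EPS:
                return False
    return True
```

Because every task shares the same cost in this sketch, each accepted move strictly lowers the cluster time, so the iteration stops after finitely many moves at an assignment from which no task benefits by deviating alone. Such an equilibrium is not guaranteed to be the globally best schedule.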

Related works
Model and algorithm
Implementation and critical steps of the resource scheduling algorithm model
Experiments and evaluation
Performance evaluation metrics
Performance of the ORSE strategy in the heterogeneous environment
Findings
Conclusion