Abstract

Hadoop is a widely used framework for processing big data, and its task scheduling algorithm has a significant impact on cluster performance. Existing Hadoop scheduling algorithms do not account for performance differences between nodes in heterogeneous clusters, which leads to uneven task allocation and low resource utilization. To address this problem, we propose a spider monkey optimization‐based scheduling algorithm (SMOSA) for heterogeneous Hadoop. First, the cluster heartbeat mechanism is used to collect each node's memory and CPU information, so that the actual load capacity of every node is taken into account. Then, the spider monkey optimization algorithm searches for the optimal mapping between tasks and resources, using the task completion time as the objective function and iteratively updating the positions of the spider monkeys. Finally, we calculate the remaining rate of each node's hardware resources and, according to the task type, assign tasks preferentially to the node with the higher remaining rate of the relevant resource. For I/O‐type tasks, data are compressed to reduce disk operations and speed up task execution. Experimental results show that, compared with existing scheduling algorithms, SMOSA reduces task execution time and significantly improves scheduling efficiency and task execution speed, especially in heterogeneous Hadoop clusters. For different types of tasks, the execution time is reduced by up to 19%.
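The two quantities the abstract relies on, the remaining resource rate of a node and the task completion time of a task-to-node mapping, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Node` fields, the relative `speed` factor, the choice of memory as the relevant resource for I/O tasks, and the definition of remaining rate as one minus utilization are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_used: float  # fraction of CPU in use, 0.0 to 1.0 (assumed heartbeat field)
    mem_used: float  # fraction of memory in use, 0.0 to 1.0 (assumed heartbeat field)
    speed: float     # hypothetical relative processing speed of the node


def remaining_rate(node: Node, task_type: str) -> float:
    """Remaining rate of the resource assumed relevant to the task type."""
    if task_type == "cpu":
        return 1.0 - node.cpu_used
    # Assumption: I/O-type tasks are ranked by free memory.
    return 1.0 - node.mem_used


def pick_node(nodes: list[Node], task_type: str) -> Node:
    """Give priority to the node with the highest remaining rate."""
    return max(nodes, key=lambda n: remaining_rate(n, task_type))


def completion_time(mapping: list[int], task_sizes: list[float],
                    nodes: list[Node]) -> float:
    """Objective value of a mapping: the time at which the busiest node finishes.

    mapping[i] is the index of the node that runs task i.
    """
    finish = [0.0] * len(nodes)
    for task, node_idx in enumerate(mapping):
        finish[node_idx] += task_sizes[task] / nodes[node_idx].speed
    return max(finish)
```

In an optimization loop of the kind the abstract describes, each candidate position would encode one `mapping`, and `completion_time` would serve as the value to minimize. The position-update rules of spider monkey optimization are not shown here.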
