Abstract

As Hadoop becomes a widespread technology for large-scale data analysis, there is a growing need to provide predictable service to users who have strict requirements on job completion times, which are typically captured in service level agreements (SLAs), each comprising an initial start time, a required execution time, a priority and an end-to-end deadline. While algorithms based on earliest deadline first (EDF) scheduling are popular for guaranteeing job deadlines in real-time systems, they are not effective in a dynamic Hadoop environment. In this work, the problem of resource allocation and scheduling is modeled using constraint programming. We develop a novel MapReduce constraint programming based matchmaking and scheduling algorithm (CP-RM) that can handle MapReduce jobs with deadlines and achieve high system performance. The CP-RM algorithm is integrated into Hadoop, a widely used open source implementation of the MapReduce programming model, as a new scheduler called the CP-RM Scheduler. We evaluate the CP-RM Scheduler's performance by comparing it with an EDF Hadoop scheduler, which is implemented by extending Hadoop's default FIFO scheduler. The results of the performance evaluation demonstrate the effectiveness of CP-RM in generating a schedule that leads to a low proportion of jobs missing their deadlines (p) and also provide insights into system behavior and performance. In experiments performed on a Hadoop cluster deployed on a local system, CP-RM achieved on average a 56% lower p than the EDF scheduler across a variety of workloads.
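To make the baseline and the evaluation metric concrete, the following is a minimal sketch of non-preemptive EDF ordering on a single resource, together with the proportion of jobs that miss their deadlines. It is an illustration under simplifying assumptions, not the schedulers evaluated in this work: the `Job` class, the `edf_late_proportion` function, the single-resource model and the sample values are all hypothetical, and the sketch ignores priorities, map and reduce tasks, and cluster slots.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical, simplified view of an SLA: one task per job.
    name: str
    start: int      # earliest time the job may begin
    duration: int   # required execution time
    deadline: int   # end-to-end deadline


def edf_late_proportion(jobs):
    """Run jobs non-preemptively on one resource, earliest deadline first.

    Returns the fraction of jobs that finish after their deadline.
    """
    pending = sorted(jobs, key=lambda j: j.start)
    clock = 0
    late = 0
    while pending:
        # Jobs whose start time has been reached are eligible to run now.
        ready = [j for j in pending if j.start <= clock]
        if not ready:
            # Nothing is eligible: idle until the next job becomes available.
            clock = pending[0].start
            continue
        # EDF rule: among eligible jobs, pick the one with the earliest deadline.
        job = min(ready, key=lambda j: j.deadline)
        pending.remove(job)
        clock += job.duration
        if clock > job.deadline:
            late += 1
    return late / len(jobs) if jobs else 0.0


jobs = [
    Job("A", start=0, duration=4, deadline=10),
    Job("B", start=1, duration=3, deadline=5),
    Job("C", start=2, duration=2, deadline=6),
]
p = edf_late_proportion(jobs)
```

In this toy instance EDF starts A at time 0 because it is the only eligible job, after which B and C both finish late. A schedule that idles until time 1 and then runs B, C and A in that order meets every deadline, which hints at why an approach that reasons over all jobs and deadlines together can miss fewer deadlines than a greedy rule.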
