Abstract

Big data analytics (BDA) applications are software applications that process huge amounts of data on large-scale parallel processing infrastructure to extract hidden value. Hadoop, which implements the MapReduce programming paradigm, is the most mature open source BDA processing framework. In many cases, BDA jobs arrive continuously and are not isolated from one another. Existing approaches that process jobs strictly in sequence are inefficient and consume a large amount of energy. In this paper, we propose a genetic algorithm based job scheduling model to improve the efficiency of BDA. To implement the scheduling model, we leverage an estimation module that predicts the performance of clusters when processing jobs. We have evaluated the proposed job scheduling model in terms of feasibility and performance.
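
As an illustration only, the following minimal sketch shows how a genetic algorithm could search over job orderings when a cost estimate is available. It is not the scheduling model from the paper: the permutation encoding, the order crossover, the swap mutation, and every name here (`estimate_cost`, `schedule_jobs`, `run_time`, `switch_cost`) are assumptions made for the example, and `estimate_cost` is a hypothetical stand-in for an estimation module, not a Hadoop API.

```python
import random


def estimate_cost(order, run_time, switch_cost):
    """Hypothetical stand-in for an estimation module (an assumption).

    Cost = sum of per-job run times plus a sequence-dependent cost
    between each pair of consecutive jobs.
    """
    total = sum(run_time[j] for j in order)
    total += sum(switch_cost[a][b] for a, b in zip(order, order[1:]))
    return total


def order_crossover(p1, p2, rng):
    """Order crossover: copy a slice from p1, fill the rest in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = [g for g in p2 if g not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child


def swap_mutation(order, rng, rate):
    """With probability `rate`, swap two positions in a copy of the order."""
    order = list(order)
    if rng.random() < rate:
        a, b = rng.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]
    return order


def schedule_jobs(run_time, switch_cost, pop_size=30, generations=100,
                  mutation_rate=0.2, seed=0):
    """Return a job order with low estimated cost (needs at least 2 jobs)."""
    rng = random.Random(seed)
    n = len(run_time)

    def cost(order):
        return estimate_cost(order, run_time, switch_cost)

    # Include the submission order so the result is never worse than it.
    population = [list(range(n))]
    population += [rng.sample(range(n), n) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=cost)
        next_gen = population[:2]              # elitism: keep the two best
        parents = population[:pop_size // 2]   # truncation selection
        while len(next_gen) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = order_crossover(p1, p2, rng)
            next_gen.append(swap_mutation(child, rng, mutation_rate))
        population = next_gen
    return min(population, key=cost)


# Made-up example data: 5 jobs with run times and pairwise switch costs.
run_time = [4, 2, 6, 3, 5]
switch_cost = [
    [0, 3, 1, 4, 2],
    [3, 0, 2, 1, 5],
    [1, 2, 0, 3, 4],
    [4, 1, 3, 0, 2],
    [2, 5, 4, 2, 0],
]
best_order = schedule_jobs(run_time, switch_cost)
print(best_order, estimate_cost(best_order, run_time, switch_cost))
```

Because the initial population contains the submission order and the two best individuals are carried over each generation, the returned order is never estimated to be worse than running the jobs as submitted. A real scheduler would replace the toy cost function with predictions from the cluster performance estimator.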
