Abstract

Vast volumes of big data arise from diverse data sources. Conventional computational frameworks and platforms are incapable of processing complex big data sets at the required pace. Cloud data centers, with their massive virtual and physical resources and computing platforms, can support big data processing. In addition, MapReduce, the most well-known big data framework, in conjunction with cloud data centers provides fundamental support for scaling up and speeding up the classification, investigation and processing of huge, massive and complex big data sets. Inappropriate handling of cloud data center resources fails to yield significant results and eventually leads to poor overall system utilization. This research aims to analyze and optimize the number of compute nodes under the MapReduce framework on cloud data center computational resources, focusing on the key issue of computational overhead due to inappropriate parameter selection and on reducing overall execution time. The evaluation has been carried out experimentally by varying the number of compute nodes, that is, map and reduce units. The results clearly show that appropriate handling of compute nodes has a significant effect on the overall performance of the cloud data center in terms of total execution time.
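As a concrete illustration of the kind of parameter sweep the abstract describes, the sketch below shows a minimal Hadoop MapReduce job in Java whose reduce-side parallelism is set explicitly and whose map-task count is influenced by capping the input split size. The class names, the word-count workload, the 64 MB split cap and the command-line argument layout are illustrative assumptions, not details taken from the paper.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NodeSweepJob {

    // Minimal word-count mapper: a stand-in for whatever workload is measured.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            StringTokenizer it = new StringTokenizer(value.toString());
            while (it.hasMoreTokens()) {
                word.set(it.nextToken());
                ctx.write(word, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The number of map tasks is derived from input splits; capping the
        // split size (assumed here at 64 MB) raises the map count for a given input.
        conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 64L * 1024 * 1024);
        Job job = Job.getInstance(conf, "node-sweep");
        job.setJarByClass(NodeSweepJob.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Reduce-side parallelism is set explicitly from the command line;
        // re-running with different values (e.g. 2, 4, 8, 16) and timing each
        // run yields an execution-time-versus-reduce-units curve.
        job.setNumReduceTasks(Integer.parseInt(args[2]));
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        long start = System.currentTimeMillis();
        boolean ok = job.waitForCompletion(true);
        System.out.println("Total execution time (ms): " + (System.currentTimeMillis() - start));
        System.exit(ok ? 0 : 1);
    }
}

Invoked repeatedly with a varying last argument, for example hadoop jar nodesweep.jar NodeSweepJob /in /out-8 8, the printed wall-clock time traces how total execution time responds to the number of reduce units.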

Highlights

  • Data sets that are so huge or complex that conventional data processing techniques cannot deal with them are called big data

  • The increasing challenges of big data are due to its diverse nature, which is characterized by its Vs [1]

  • To deal with ever-growing data sets, giants like Google, IBM, Microsoft and Amazon have turned their attention to cloud computing

Introduction

Data sets that are so huge or complex that conventional data processing techniques cannot deal with them are called big data. As big data grows enormously, so do its processing requirements: successfully analyzing and processing such large amounts of data calls for huge computational infrastructure. This is a two-pronged challenge: on one hand, the amount of data is constantly increasing and must be allocated to a suitable set of available resources; on the other hand, the output must be produced in less time and at minimum cost. To deal with ever-growing data sets, giants like Google, IBM, Microsoft and Amazon have turned their attention to cloud computing and have offered various cloud-based services [2]. These services are accessible on a pay-per-use, on-demand basis [3].
