Abstract

Data volumes are growing rapidly, which makes Big Data processing a necessity. This is especially true of business applications, where huge volumes of data must be processed efficiently. Traditional systems fail to process big data because most of it is unstructured. Resource utilization plays a vital role in the performance of distributed data processing. Resource gaps develop during execution, and they are more frequent in heterogeneous environments. Previous techniques waste resources or use them inefficiently. Several platforms are used to process data in distributed environments, such as Apache Hadoop and Apache Spark. Here we develop a new algorithm that reduces resource usage and increases performance. The algorithm is implemented in the Apache Spark distributed environment. The experimental results indicate efficient utilization of resources and an increase in performance.
