Abstract
MapReduce is a programming model for processing huge data sets in a parallel, distributed fashion using map and reduce tasks. Many efforts have been made to improve MapReduce performance, but they neglect the network traffic generated in the shuffle stage. Existing traffic-aware MapReduce partitioning schemes suffer from partition skew, where the output of map tasks is unevenly distributed among reduce tasks. Existing solutions follow a similar principle: they repartition the workload among reduce tasks. However, these approaches often incur high performance overhead because of partition size prediction and repartitioning. The proposed work adopts the dynamic data-aware parallel with k-Means algorithm (DDAP-kM), a framework that provides dynamic partitioning skew reduction and clustering for MapReduce jobs. It copes with partitioning skew by adjusting the run-time resource allocation of reduce tasks. In the experiments, the network traffic cost of the traffic-aware partition algorithm is compared with that of the DDAP-kM algorithm.
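The partition skew described above can be illustrated with a minimal sketch. It is not the DDAP-kM algorithm; it only shows how default hash partitioning can overload one reduce task when a few keys dominate the map output. The function names, the CRC32-based partitioner, and the sample data are assumptions made for illustration.

```python
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Assign a key to a reduce task with a deterministic hash (assumed partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(map_output, num_reducers):
    """Count how many (key, value) records each reduce task would receive."""
    loads = [0] * num_reducers
    for key, _value in map_output:
        loads[hash_partition(key, num_reducers)] += 1
    return loads


def skew_ratio(loads):
    """Ratio of the heaviest reduce task's load to the mean load (1.0 means balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean > 0 else 0.0


# Hypothetical map output: one "hot" key accounts for 900 of 1000 records.
map_output = [("hot", 1)] * 900 + [(f"key{i}", 1) for i in range(100)]

loads = reducer_loads(map_output, num_reducers=4)
ratio = skew_ratio(loads)
print("records per reduce task:", loads)
print("skew ratio (max / mean):", round(ratio, 2))
```

Because all records with the same key go to the same reduce task, the task that receives the hot key carries at least 900 records regardless of the hash function. This is the imbalance that skew-handling schemes try to address, whether by repartitioning or, as in the proposed work, by adjusting the resources allocated to reduce tasks.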