Abstract

The MapReduce programming model simplifies large-scale data processing on commodity clusters by exploiting parallel map tasks and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a critical role in performance. Traditionally, a hash function is used to partition intermediate data among reduce tasks. This is not traffic-efficient, however, because the network topology and the data size associated with each key are not taken into consideration. In this paper, we study how to reduce the network traffic cost of a MapReduce job by designing a novel intermediate data partition scheme. Furthermore, we jointly consider the aggregator placement problem, where each aggregator can reduce the merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to deal with the large-scale optimization problem for big data applications, and an online algorithm is also designed to adjust data partition and aggregation dynamically. Finally, extensive simulation results demonstrate that our proposals can significantly reduce network traffic cost in both offline and online cases.
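To make the contrast between hash partitioning and traffic-aware partitioning concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a hypothetical cost model in which traffic cost is data size multiplied by the hop distance between a map task and a reduce task, and it ignores reducer load balancing and aggregator placement. All names (`hash_partition`, `greedy_partition`, `traffic_cost`, `distance`, `data_size`) and the toy numbers are illustrative assumptions.

```python
from zlib import crc32

def hash_partition(key, num_reducers):
    """Topology-oblivious baseline: pick a reducer from the key's hash."""
    return crc32(key.encode("utf-8")) % num_reducers

def traffic_cost(assignment, data_size, distance):
    """Total cost = sum over (mapper, key) of size * distance to the key's reducer."""
    return sum(
        size * distance[mapper][assignment[key]]
        for (mapper, key), size in data_size.items()
    )

def greedy_partition(keys, data_size, distance, num_reducers):
    """Assumed heuristic: send each key to the reducer with the lowest cost for that key."""
    assignment = {}
    for key in keys:
        def cost_to(reducer):
            return sum(
                size * distance[mapper][reducer]
                for (mapper, k), size in data_size.items()
                if k == key
            )
        assignment[key] = min(range(num_reducers), key=cost_to)
    return assignment

# Toy instance (made-up numbers): two mappers, two reducers, two keys.
# distance[mapper][reducer] is the hop count between the two tasks.
distance = {"m0": [1, 3], "m1": [3, 1]}
# data_size[(mapper, key)] is the intermediate data that mapper emits for key.
data_size = {("m0", "a"): 10, ("m1", "a"): 1, ("m0", "b"): 2, ("m1", "b"): 8}
keys = ["a", "b"]
num_reducers = 2

hash_assignment = {k: hash_partition(k, num_reducers) for k in keys}
greedy_assignment = greedy_partition(keys, data_size, distance, num_reducers)

hash_cost = traffic_cost(hash_assignment, data_size, distance)
greedy_cost = traffic_cost(greedy_assignment, data_size, distance)
print("hash:", hash_assignment, hash_cost)
print("greedy:", greedy_assignment, greedy_cost)
```

Under this simplified cost model the per-key greedy choice can never cost more than the hash assignment, because each key's cost is minimized independently. A realistic scheme would also have to balance reducer load and account for aggregation, which is what makes the full problem a large-scale optimization.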
