Abstract

Reducing network traffic in the shuffle phase is important for improving the performance of MapReduce. This work reduces that traffic through data partition and aggregation. Traditionally, a hash function is used to partition intermediate data among reduce tasks, but this traditional approach handles network traffic inefficiently. A novel intermediate data partition scheme is designed to reduce network traffic cost in MapReduce. The aggregator placement problem is also considered, where each aggregator can reduce the merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to deal with the large-scale optimization problem for big data applications, and an online algorithm is designed to adjust data partition and aggregation in a dynamic manner. Simulation results demonstrate that the proposals significantly reduce network traffic cost in both offline and online cases.
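
To make the baseline concrete, the sketch below illustrates traditional hash partitioning of intermediate key/value pairs and how aggregation before the shuffle can lower the number of records sent to reduce tasks. It is not the partition scheme or aggregator placement algorithm proposed in the paper. The function names (`hash_partition`, `shuffle_traffic`), the toy word-count data, and the choice to count one unit of traffic per record are all assumptions made for illustration, and aggregation is simplified to one aggregator per map task.

```python
from collections import Counter, defaultdict
import zlib


def hash_partition(key, num_reducers):
    # Baseline partitioner: a deterministic hash of the key, modulo the
    # number of reduce tasks. crc32 is used because Python's built-in
    # hash() is salted per process for strings.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_traffic(map_outputs, num_reducers, aggregate):
    """Return (records sent in the shuffle, final per-key totals).

    Assumption: each record sent from a map task to a reduce task costs
    one unit of traffic. Real costs also depend on sizes and distances.
    """
    traffic = 0
    reducer_inputs = defaultdict(Counter)
    for pairs in map_outputs:
        if aggregate:
            # Hypothetical aggregator: merge values that share a key
            # before they leave the map task.
            merged = Counter()
            for key, value in pairs:
                merged[key] += value
            pairs = list(merged.items())
        for key, value in pairs:
            reducer = hash_partition(key, num_reducers)
            reducer_inputs[reducer][key] += value
            traffic += 1
    totals = Counter()
    for counts in reducer_inputs.values():
        totals.update(counts)
    return traffic, totals


# Toy word-count style output from two map tasks (illustrative data).
map_outputs = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("c", 1)],
]

raw_traffic, raw_totals = shuffle_traffic(map_outputs, 2, aggregate=False)
agg_traffic, agg_totals = shuffle_traffic(map_outputs, 2, aggregate=True)
print(raw_traffic, agg_traffic)
```

In this toy run the reduce-side totals are the same with and without aggregation, while fewer records cross the network when values are merged first. Hash partitioning ignores where map and reduce tasks are located, which is the inefficiency the abstract attributes to the traditional method.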

