Abstract

In this paper, we develop algorithms for the data aggregation problem, which arises in big-data applications that employ the MapReduce operation. For the case in which source racks can send their data to the aggregator over multiple paths, we show that an aggregation tree topology that minimizes aggregation time can be constructed in polynomial time. We also consider the problem of constructing aggregation trees that minimize total network traffic subject to the primary constraint that aggregation time is minimized, and we present heuristics for this problem. Experiments show that allowing multiple paths reduces aggregation time by up to 99% relative to the aggregation trees constructed using the LPT rule [3]. This reduction in aggregation time, however, comes with an increase of up to 35% in total network traffic when racks have more than 2 optical links.
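For readers unfamiliar with the baseline, the LPT (longest processing time first) rule in its generic scheduling form sorts jobs by decreasing size and assigns each one to the currently least-loaded machine. The sketch below illustrates only that generic rule. Treating data transfers as jobs and links as machines is an assumption made for illustration, and the names `lpt_schedule`, `sizes`, and `num_links` are hypothetical; how [3] applies the rule to build aggregation trees is not described here.

```python
import heapq


def lpt_schedule(sizes, num_links):
    """Generic LPT rule: largest job first, onto the least-loaded link.

    `sizes` and `num_links` are hypothetical names chosen for this sketch;
    mapping transfers to jobs and links to machines is an assumption.
    """
    # Min-heap of (current load, link index), so the lightest link pops first.
    heap = [(0, link) for link in range(num_links)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_links)]

    # Visit jobs in order of decreasing size (the LPT ordering).
    order = sorted(range(len(sizes)), key=lambda j: sizes[j], reverse=True)
    for job in order:
        load, link = heapq.heappop(heap)
        assignment[link].append(job)
        heapq.heappush(heap, (load + sizes[job], link))

    # The completion time is the load of the most heavily loaded link.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    assignment, makespan = lpt_schedule([7, 5, 4, 3, 1], num_links=2)
    print(assignment, makespan)
```

The heap keeps each assignment step logarithmic in the number of links, so the sort dominates the running time.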
