Abstract

Distributed stream processing engines (DSPEs) provide various stateless stream partitioning methods, which select the receiver task for each message regardless of its data fields. A representative DSPE, Apache Storm, offers two polarized stateless partitioning methods: Shuffle grouping, which considers only fairness, and Local-or-Shuffle grouping, which considers only locality. The recently proposed Locality Aware grouping is a novel technique that addresses this polarization. However, it is difficult to select an appropriate stream partitioning method given the varied configurations of distributed stream applications, network capacity, and data size. In this paper, we benchmark the stateless stream partitioning methods under different network bandwidths. To vary the bandwidth, we run experiments on widely used Ethernet equipment and on InfiniBand, a more recent high-performance network technology. The benchmark results can serve as selection criteria for choosing an appropriate stream partitioning method according to the network bandwidth.
