Abstract

A key feature of distributed stream processing (DSP) systems is the scheduling of operators on a cluster of computers. A scheduler must consider the assignment plan of operators to cluster nodes, the requirements of the operators, and the computational power of each worker node, with the goal of finding a tradeoff between the communication latency among operators and the utilization of worker nodes that minimizes the overall system response time. Reaching this goal is challenging, especially in heterogeneous clusters, because the loads of worker nodes cannot be estimated accurately at run time. To address this challenge, we propose SP-Ant, a novel stream processing scheduler based on the ant colony algorithm. SP-Ant searches for the best operator assignment plan with respect to the inter-node communication latencies of operators in two steps. It first collocates highly communicative operators on the same worker nodes using a bin-packing algorithm. It then iteratively (re-)schedules only the less communicative operators using the exploration and exploitation phases of the evolutionary ant colony optimization (ACO) algorithm, in order to reduce the convergence time of the search. SP-Ant is implemented on standard Apache Storm. On several standard Storm benchmark topologies, SP-Ant outperforms the R-Storm and default Storm schedulers by at least 50% in reducing the response time.
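
The two-step idea described above (pin heavily communicating operators together, then let an ant colony search place the rest) can be illustrated with a minimal Python sketch. This is not the SP-Ant implementation: the cost function (total traffic crossing node boundaries), the traffic `threshold`, the first-fit packing rule, the pheromone update, and all names and parameters (`collocate_heavy_pairs`, `aco_schedule`, `q0`, `rho`) are assumptions made only for illustration.

```python
import random

def comm_cost(plan, traffic):
    """Assumed cost: total traffic between operators placed on different nodes."""
    return sum(w for (a, b), w in traffic.items() if plan[a] != plan[b])

def collocate_heavy_pairs(traffic, demand, capacity, threshold):
    """Step 1 (assumed first-fit packing): put each operator pair whose
    traffic is at least `threshold` on the first node that fits both."""
    plan, free = {}, dict(capacity)
    for (a, b), w in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if w < threshold or a in plan or b in plan:
            continue
        need = demand[a] + demand[b]
        for node in free:
            if free[node] >= need:
                plan[a] = plan[b] = node
                free[node] -= need
                break
    return plan, free

def aco_schedule(operators, traffic, demand, capacity, threshold,
                 ants=20, iterations=50, rho=0.1, q0=0.5, seed=0):
    """Step 2: ant colony search over the operators left unplaced by step 1."""
    rng = random.Random(seed)
    fixed, free0 = collocate_heavy_pairs(traffic, demand, capacity, threshold)
    rest = [op for op in operators if op not in fixed]
    nodes = list(capacity)
    tau = {(op, n): 1.0 for op in rest for n in nodes}  # pheromone trails
    best_plan, best_cost = None, float("inf")
    for _ in range(iterations):
        for _ in range(ants):
            plan, free = dict(fixed), dict(free0)
            for op in rest:
                feasible = [n for n in nodes if free[n] >= demand[op]]
                if not feasible:
                    break  # this ant found no feasible placement; discard it
                if rng.random() < q0:
                    # exploitation: follow the strongest pheromone trail
                    node = max(feasible, key=lambda n: tau[(op, n)])
                else:
                    # exploration: sample a node in proportion to pheromone
                    weights = [tau[(op, n)] for n in feasible]
                    node = rng.choices(feasible, weights=weights)[0]
                plan[op] = node
                free[node] -= demand[op]
            else:
                cost = comm_cost(plan, traffic)
                if cost < best_cost:
                    best_plan, best_cost = plan, cost
        for key in tau:  # evaporation
            tau[key] *= 1.0 - rho
        if best_plan is not None:  # reinforce the best plan found so far
            for op in rest:
                tau[(op, best_plan[op])] += 1.0 / (1.0 + best_cost)
    return best_plan, best_cost

# Hypothetical four-operator topology on two worker nodes with two slots each.
operators = ["spout", "split", "count", "sink"]
traffic = {("spout", "split"): 100, ("split", "count"): 20, ("count", "sink"): 5}
demand = {op: 1 for op in operators}
capacity = {"n1": 2, "n2": 2}
plan, cost = aco_schedule(operators, traffic, demand, capacity, threshold=50)
print(plan, cost)
```

Restricting the ant colony search to the operators that step 1 leaves unplaced shrinks the search space, which is the intuition behind the reduced convergence time claimed above. A real scheduler would also need run-time load and latency measurements, which this sketch replaces with static `traffic` and `demand` tables.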
