Abstract

In the current big data era, the development of smart schedulers has become a necessity. A Distributed Stream Processing System (DSPS) is responsible for assigning processing tasks to the available resources (dynamically or not) and routing streaming data between them. Smart and efficient task scheduling can reduce latency and eliminate network congestion. The most commonly used scheduler is the default Storm scheduler, which has proven to have certain disadvantages, such as the inability to handle system changes in a dynamic environment. In such a scenario, some type of rescheduling is necessary to keep the system working as efficiently as possible. In this paper, we extend our previous work on dynamic task scheduling, Souravlas and Anastasiadou (Appl Sci 10(14):4796, 2020) and Souravlas et al. (Appl Sci 11(1):61, 2021), and present a mathematical model that offers better balance and produces fewer communication steps. The scheduler is based on the idea of generating larger sets of communication steps among the system nodes, which we call superclasses. Our experiments show that this scheme achieves better balancing and reduces the overall latency.

