Abstract

The era of big data has led to the emergence of new real-time distributed stream processing engines like Apache Storm. We present Stela (STream processing ELAsticity), a stream processing system that supports scale-out and scale-in operations in an on-demand manner, i.e., when the user requests such a scaling operation. Stela meets two goals: 1) it optimizes post-scaling throughput, and 2) it minimizes interruption to the ongoing computation while the scaling operation is being carried out. We have integrated Stela into Apache Storm. We present experimental results using micro-benchmark Storm applications, as well as production applications from industry (Yahoo! Inc. and IBM). Our experiments show that compared to Apache Storm's default scheduler, Stela's scale-out operation achieves throughput that is 21-120% higher, and interruption time that is significantly smaller. Stela's scale-in operation chooses the right set of servers to remove and achieves 2X-5X higher throughput than Storm's default strategy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call