Abstract

This paper addresses the problem of auto-parallelization of streaming applications. We propose an online parallelization optimization algorithm that adjusts the degree of pipeline and data parallelism in a joint manner. We define an operator development API and a flexible parallel execution model to form a basis for the optimization algorithm. The operator interface unifies the development of different types of operators and makes operator properties visible in order to enable safe optimizations. The parallel execution model splits a data flow graph into regions. A region contains the longest sequence of compatible operators that are amenable to data parallelism as a whole and can be further parallelized with pipeline parallelism. We also develop a stream processing run-time, named Joker, to scale the execution of streaming applications in a safe, transparent, dynamic, and automatic manner. This ability is called organic adaptation. Joker implements the runtime machinery to execute a data flow graph with any parallelization configuration and most importantly change this configuration at run-time with low cost in the presence of partitioned stateful operators, in a way that is transparent to the application developers. Joker continuously monitors the run-time performance, and runs the optimization algorithm to resolve bottlenecks and scale the application by adjusting the degree of pipeline and data parallelism. The experimental evaluation based on micro-benchmarks and real-world applications showcase that our solution accomplishes elasticity by finding an effective parallelization configuration.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call