Abstract

Spark is a framework for processing and analyzing big data. In this paper, we introduce a framework for building complex Spark applications by composing simpler ones. We support composition at two levels of granularity: the fine granularity composes sub-Spark applications, while the coarse granularity composes complete Spark applications. Composition is driven by a configuration file that defines the connections between the sub-Spark and Spark applications. Moreover, when composing sub-Spark applications, we introduce several scenarios that automatically persist and unpersist the most frequently used data to improve performance. We also present a method for parameterizing a system of several Spark applications with respect to their execution quality. We then introduce several strategies that dynamically select the highest quality levels at which the given Spark applications can execute while meeting a user-defined deadline. We present experimental results showing the effectiveness of our method with respect to the composition, performance, and quality of service of Spark applications.
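The abstract does not detail the persist/unpersist scenarios themselves, but the underlying idea can be illustrated with a minimal Scala sketch: when two composed sub-applications consume the same intermediate dataset, caching it once and releasing it after the last consumer avoids recomputing the lineage for each action. The input path, the split into two sub-applications, and all names below are hypothetical, not taken from the paper.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object ComposedPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("composed-pipeline")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Intermediate dataset shared by both sub-applications below
    // (hypothetical input path).
    val events = spark.read.textFile("events.txt")
      .map(_.toLowerCase)

    // Persist because two downstream sub-applications reuse `events`;
    // without it, each action would recompute the full lineage.
    events.persist(StorageLevel.MEMORY_AND_DISK)

    // Sub-application 1: count the distinct events.
    val distinctCount = events.distinct().count()

    // Sub-application 2: compute the frequency of each event.
    val frequencies = events.groupByKey(identity).count().collect()

    // All consumers have run; release the cached partitions.
    events.unpersist()

    println(s"distinct events: $distinctCount, histogram size: ${frequencies.length}")
    spark.stop()
  }
}
```

In a composition framework such as the one described, the decision of where to place `persist` and `unpersist` calls would be derived automatically from the configuration file's connection graph rather than written by hand as above.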
