An incremental reinforcement learning scheduling strategy for data‐intensive scientific workflows in the cloud

André Nascimento, Aline Paes, Vítor Silva, Daniel Oliveira

doi:10.1002/cpe.6193




Abstract


Most scientific experiments can be modeled as workflows. These workflows are usually computing‐ and data‐intensive, demanding the use of high‐performance computing environments such as clusters, grids, and clouds. Clouds offer the advantage of elasticity, which allows the number of virtual machines (VMs) to be changed on demand. Workflows are typically managed using scientific workflow management systems (SWfMSs). Many existing SWfMSs support cloud‐based execution. Each SWfMS has its own scheduler, which follows a well‐defined cost function. However, such cost functions should consider the characteristics of a dynamic environment, such as live migrations or performance fluctuations, which are far from trivial to model. This article proposes a novel scheduling strategy, named ReASSIgN, based on reinforcement learning (RL). An RL technique assumes that there is an optimal (or suboptimal) solution to the scheduling problem and aims to learn the best schedule from previous executions, in the absence of a mathematical model of the environment. To this end, an extension of the well‐known workflow simulator WorkflowSim is proposed to implement an RL strategy for scheduling workflows. Once the scheduling plan is generated via simulation, the workflow is executed in the cloud using the SciCumulus SWfMS. We conducted a thorough evaluation of the proposed scheduling strategy using a real astronomy workflow named Montage.
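
To illustrate the general idea of learning a schedule from repeated (simulated) executions without a model of the environment, the following is a minimal sketch of tabular Q-learning applied to a toy task-to-VM assignment problem. It is not the ReASSIgN algorithm, whose details the abstract does not give. The task runtimes, VM speed factors, makespan-based reward, and all function names are hypothetical choices made only for this example.

```python
import random
from collections import defaultdict

# Hypothetical toy data: task runtimes on a baseline VM, and VM speed factors.
TASKS = [4.0, 2.0, 6.0, 3.0]
VM_SPEED = [1.0, 2.0]  # VM 1 is assumed to be twice as fast as VM 0


def makespan(assignment):
    """Finish time of the busiest VM for a complete task-to-VM assignment."""
    load = [0.0] * len(VM_SPEED)
    for runtime, vm in zip(TASKS, assignment):
        load[vm] += runtime / VM_SPEED[vm]
    return max(load)


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Learn Q-values by repeatedly simulating complete schedules.

    A state is the tuple of VM choices made so far; an action is the VM
    chosen for the next task. The only reward is the negative makespan,
    given once the last task has been placed.
    """
    rng = random.Random(seed)
    q = defaultdict(float)
    actions = list(range(len(VM_SPEED)))
    for _ in range(episodes):
        state = ()
        while len(state) < len(TASKS):
            # Epsilon-greedy: explore occasionally, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt = state + (action,)
            done = len(nxt) == len(TASKS)
            reward = -makespan(nxt) if done else 0.0
            best_next = 0.0 if done else max(q[(nxt, a)] for a in actions)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def greedy_plan(q):
    """Extract a scheduling plan by following the highest Q-value at each step."""
    state = ()
    while len(state) < len(TASKS):
        state += (max(range(len(VM_SPEED)), key=lambda a: q[(state, a)]),)
    return state


if __name__ == "__main__":
    plan = greedy_plan(train())
    print("plan:", plan, "makespan:", makespan(plan))
```

In this sketch the learner never sees the speed factors directly; it only observes the makespan of each simulated run, which mirrors the model-free setting described above. A real scheduler would also have to handle task dependencies, data transfers, and performance fluctuations, none of which are modeled here.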
