Abstract

Although the composition of scientific workflows has been widely studied, a general and efficient approach to composing them automatically is still lacking. In this chapter, we present a STRIPS-based formal definition of the scientific workflow composition problem, followed by an algorithm for the automatic composition of high-quality (portable, fault-tolerant, and optimized) scientific workflows. The algorithm consists of two sub-algorithms, which handle control flow composition and data flow composition, respectively. The automatic control flow composition algorithm searches for Activity Functions (AFs) and automatically composes them into scientific workflows using an AF Data Dependence (ADD) graph. The composition process consists of three phases: ADD graph creation, workflow extraction, and workflow optimization. The worst-case complexity of the algorithm is quadratic in the number of AFs. An extension of the algorithm that composes scientific workflows with branches and loops is also presented. Once the control flow is established, the data flow composition algorithm composes the data flow by locating the possible source data ports of each sink data port through backward traversal of the control flow, and then matching source data ports against sink data ports based on data semantics.
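The first two phases of control flow composition can be illustrated with a minimal sketch. It is not the chapter's algorithm: it assumes each AF is described, STRIPS-style, only by the set of data items it consumes and the set it produces, and the names AF, build_add_graph, extract_workflow, and the sample AFs are hypothetical. The optimization phase, branches, loops, and data port matching are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AF:
    """Hypothetical Activity Function: consumes `inputs`, produces `outputs`."""
    name: str
    inputs: frozenset
    outputs: frozenset


def build_add_graph(afs, initial):
    """Forward phase: keep adding AFs whose inputs are all available.

    Returns (producer, order). `producer` maps each newly produced data item
    to the AF that first produces it (the data dependence edges); `order` is
    a valid execution order of the applicable AFs.
    """
    available = set(initial)
    producer = {}
    order = []
    remaining = list(afs)
    progress = True
    while progress:  # at most len(afs) + 1 passes over at most len(afs) AFs
        progress = False
        for af in list(remaining):
            if af.inputs <= available:
                remaining.remove(af)
                order.append(af)
                for item in af.outputs - available:
                    producer[item] = af
                available |= af.outputs
                progress = True
    return producer, order


def extract_workflow(producer, order, initial, goal):
    """Backward phase: walk from the goal data back to the initial data.

    Returns the names of the AFs that are actually needed, in execution
    order, or None if some goal item cannot be produced.
    """
    needed = set()
    pending = [item for item in goal if item not in initial]
    while pending:
        item = pending.pop()
        af = producer.get(item)
        if af is None:
            return None  # no AF produces this item
        if af not in needed:
            needed.add(af)
            pending.extend(x for x in af.inputs if x not in initial)
    return [af.name for af in order if af in needed]


# Hypothetical AFs for illustration only.
afs = [
    AF("read", frozenset({"raw"}), frozenset({"table"})),
    AF("clean", frozenset({"table"}), frozenset({"clean_table"})),
    AF("plot", frozenset({"clean_table"}), frozenset({"figure"})),
    AF("stats", frozenset({"table"}), frozenset({"summary"})),
]
initial = {"raw"}
producer, order = build_add_graph(afs, initial)
workflow = extract_workflow(producer, order, initial, {"figure"})
print(workflow)  # ['read', 'clean', 'plot']
```

In this sketch the forward phase makes at most a quadratic number of applicability checks in the number of AFs, and the backward phase drops "stats" because nothing on the path to the goal depends on its output.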
