Abstract

We propose a Markov decision process (MDP) model for solving the Web service composition (WSC) problem. Iterative policy evaluation, value iteration, and policy iteration algorithms are used to experimentally validate our approach with both artificial and real data. The experimental results show the reliability of the model and the methods employed, with policy iteration requiring the fewest iterations to estimate an optimal policy with the highest Quality of Service attributes. Our experiments show that a WSC problem with a set of 100,000 individual Web services, where a valid composition requires selecting 1,000 services from the available set, can be solved in the worst case in less than 200 seconds on an Intel Core i5 computer with 6 GB of RAM. Moreover, a real WSC problem involving only 7 individual Web services requires less than 0.08 seconds with the same computational power. Finally, a comparison with two popular reinforcement learning algorithms, Sarsa and Q-learning, shows that these algorithms require one to two orders of magnitude more time than policy iteration, iterative policy evaluation, and value iteration to handle WSC problems of the same complexity.

Highlights

  • A Web service is a software system designed to support interoperable machine-to-machine interaction over a network, with an interface described in a machine-processable format called Web Services Description Language [1]

  • All three algorithms found the solution to the Web service composition problem very quickly, in less than 0.07 seconds, with policy iteration performing best

  • In this paper we have proposed a Markov decision process (MDP) model to address the Web service composition problem

Summary

Introduction

A Web service is a software system designed to support interoperable machine-to-machine interaction over a network, with an interface described in a machine-processable format called the Web Services Description Language [1]. Reinforcement learning methods [25] belong to a family of algorithms closely related to MDPs. The main difference is that in reinforcement learning the state transition function is assumed to be unknown, so agents must explore their state and action spaces by executing different actions in different states and observing the numerical rewards obtained after each state transition. In this paper we propose an MDP model to solve the WSC problem. All three algorithms provided good solutions, with the policy iteration algorithm requiring the minimum number of iterations to converge to an optimal solution. We compared these three algorithms against Sarsa and Q-learning, showing that the latter methods require one to two orders of magnitude more time to solve composition problems of the same complexity.
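As a minimal illustration of the dynamic-programming side of this comparison (not the paper's specific WSC model), the following Python sketch runs policy iteration on a small, hypothetical MDP whose transition probabilities and rewards are known in advance; the number of states, the actions, the numerical values, and the discount factor are assumptions chosen only for the example.

    import numpy as np

    # A small, hypothetical MDP with known dynamics (an illustrative assumption,
    # not the WSC model from the paper): 3 states, 2 actions.
    # P[s, a, s'] = transition probability, R[s, a] = immediate reward.
    n_states, n_actions = 3, 2
    P = np.array([
        [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
        [[0.0, 0.6, 0.4], [0.0, 0.1, 0.9]],
        [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # state 2 is absorbing
    ])
    R = np.array([
        [0.0, 0.5],
        [1.0, 2.0],
        [0.0, 0.0],
    ])
    gamma = 0.95  # discount factor (assumed)

    def policy_evaluation(policy, tol=1e-8):
        """Iteratively evaluate a deterministic policy until its value function converges."""
        V = np.zeros(n_states)
        while True:
            V_new = np.array([R[s, policy[s]] + gamma * P[s, policy[s]] @ V
                              for s in range(n_states)])
            if np.max(np.abs(V_new - V)) < tol:
                return V_new
            V = V_new

    def policy_iteration():
        """Alternate policy evaluation and greedy improvement until the policy is stable."""
        policy = np.zeros(n_states, dtype=int)
        while True:
            V = policy_evaluation(policy)
            # Greedy improvement: pick the action maximizing the one-step lookahead value.
            Q = R + gamma * np.einsum('sat,t->sa', P, V)
            new_policy = np.argmax(Q, axis=1)
            if np.array_equal(new_policy, policy):
                return policy, V
            policy = new_policy

    if __name__ == "__main__":
        policy, V = policy_iteration()
        print("optimal policy:", policy)
        print("state values:", V)

In the model-free setting of Sarsa and Q-learning, P and R would not be available; their effect would have to be learned from sampled transitions, which is the main reason such methods typically need many more interactions to reach a comparable policy.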

Markov Decision Processes
Web Service Composition Model
Experimental Evaluation
Comparison with Sarsa and Q-Learning
Conclusion