Abstract

As online services keep evolving, a service composition should maintain a degree of adaptivity, especially in a dynamic composition environment. Meanwhile, the large number of potential candidate services raises scalability concerns, which demand efficient composition solutions. This paper presents a multi-agent reinforcement learning model for Web service composition that effectively addresses both challenges. In particular, we model a service composition as a Markov Decision Process. Based on this model, agents in a team benefit from one another's experience. In contrast to single-agent reinforcement learning, our method speeds up convergence to an optimal policy. We develop two multi-agent reinforcement learning algorithms. The first introduces the concept of the articulate state together with distributed Q-learning to shorten convergence time. The second proposes an experience-sharing strategy to improve learning efficiency. Because the learning process continues throughout the life cycle of a service composition, our algorithms automatically adapt to changes in the environment and to evolving component services. We conduct a simulation study comparing our algorithms with similar reinforcement learning approaches, including the traditional Q-learning algorithm, a multi-agent SARSA algorithm, a Q-learning algorithm based on Gaussian processes, and a multi-agent Q-learning algorithm, to demonstrate the effectiveness of our model and algorithms.
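To make the idea concrete, the following is a minimal sketch of multi-agent Q-learning with periodic experience sharing, where the composition is treated as an MDP whose states are workflow steps and whose actions are candidate service selections rewarded by a QoS utility. All names and parameters (WORKFLOW_STEPS, CANDIDATES, REWARD, share_experience, the sharing rule, and the learning rates) are illustrative assumptions for this sketch, not the paper's actual formulation.

```python
import random
from collections import defaultdict

# Illustrative MDP: states are workflow steps, actions are candidate services.
WORKFLOW_STEPS = 5            # abstract tasks in the composition (assumed)
CANDIDATES = 4                # candidate services per task (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

# Stand-in QoS utility table (e.g., a weighted score of response time,
# availability, cost); fixed here only so the example is reproducible.
random.seed(7)
REWARD = [[random.uniform(0.0, 1.0) for _ in range(CANDIDATES)]
          for _ in range(WORKFLOW_STEPS)]

class Agent:
    def __init__(self):
        # Q[state][action]; state = workflow step, action = chosen service.
        self.Q = defaultdict(lambda: [0.0] * CANDIDATES)

    def act(self, state):
        if random.random() < EPSILON:             # epsilon-greedy exploration
            return random.randrange(CANDIDATES)
        return max(range(CANDIDATES), key=lambda a: self.Q[state][a])

    def learn(self, state, action, reward, next_state, done):
        # Standard Q-learning update toward the bootstrapped target.
        target = reward if done else reward + GAMMA * max(self.Q[next_state])
        self.Q[state][action] += ALPHA * (target - self.Q[state][action])

def share_experience(agents):
    """One possible sharing rule (assumed): every agent adopts the team's
    best Q-value for each (state, action) pair."""
    for state in range(WORKFLOW_STEPS):
        for a in range(CANDIDATES):
            best = max(ag.Q[state][a] for ag in agents)
            for ag in agents:
                ag.Q[state][a] = best

def train(agents, episodes=200, share_every=10):
    for ep in range(episodes):
        for ag in agents:
            for state in range(WORKFLOW_STEPS):   # walk the abstract workflow
                action = ag.act(state)
                reward = REWARD[state][action]
                done = state == WORKFLOW_STEPS - 1
                ag.learn(state, action, reward, state + 1, done)
        if (ep + 1) % share_every == 0:
            share_experience(agents)              # periodic experience sharing
    # Greedy composition: best service per workflow step (agent 0's policy).
    return [max(range(CANDIDATES), key=lambda a: agents[0].Q[s][a])
            for s in range(WORKFLOW_STEPS)]

if __name__ == "__main__":
    team = [Agent() for _ in range(3)]
    print("selected service per step:", train(team))
```

In this sketch the team converges faster than a lone learner simply because every agent's Q-table absorbs the best estimates found by any teammate; the paper's algorithms refine this idea with the articulate state and a more principled sharing strategy.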
