Abstract

A self-adaptive system can automatically maintain its quality requirements in the presence of dynamic environment changes. Developing a self-adaptive system may be difficult due to design time uncertainty; e.g., anticipating all potential environment changes at design time is in most cases infeasible. To realize self-adaptive systems in the presence of design time uncertainty, online machine learning, i.e., machine learning at runtime, is increasingly used. In particular, online reinforcement learning has been proposed, which learns suitable adaptation actions through interactions with the environment at runtime. To learn about its environment, online reinforcement learning has to select actions that were not selected before, which is known as exploration. How exploration happens impacts the performance of the learning process. We focus on two problems related to how adaptation actions are explored. First, existing solutions randomly explore adaptation actions and thus may exhibit slow learning if there are many possible adaptation actions. Second, they are unaware of system evolution, and thus may explore new adaptation actions introduced during evolution rather late. We propose novel exploration strategies that use feature models (from software product line engineering) to guide exploration in the presence of many adaptation actions and system evolution. Experimental results for two realistic self-adaptive systems indicate an average speed-up of the learning process of 33.7% in the presence of many adaptation actions, and of 50.6% in the presence of evolution.
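The exploration problem the abstract refers to can be made concrete with a small sketch. The following is a minimal, hypothetical illustration (not the paper's strategy) of conventional ε-greedy exploration, which picks a uniformly random adaptation action with probability ε; because every action is equally likely to be tried, learning slows down as the number of adaptation actions grows:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Select an adaptation action from learned value estimates.

    With probability epsilon, explore a uniformly random action;
    otherwise exploit the action with the highest learned value.
    """
    if random.random() < epsilon:
        # Uniform random exploration: with n adaptation actions, each is
        # tried with probability epsilon/n, so large action spaces are
        # covered slowly -- the first problem the paper addresses.
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```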

Highlights

  • A self-adaptive system can modify its own structure and behavior at runtime based on its perception of the environment, of itself and of its requirements [9,24,34]

  • To capture changes in the system’s adaptation space due to system evolution, we propose the FM-difference exploration strategy, which leverages the differences in feature models before (M) and after (M′) an evolution step (see the sketch after these highlights)

  • To facilitate reproducibility and replicability, our code, the data used, and our experimental results are available online. Figure 3 visualizes the learning process by showing how rewards develop over time, while Table 2 quantifies the learning performance using the metrics introduced above
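The FM-difference idea from the second highlight can be sketched as a set difference over adaptation actions, i.e., the valid feature-model configurations before and after an evolution step. This is a hypothetical illustration under an assumed representation (configurations as frozensets of selected feature names), not the paper's implementation:

```python
# Hypothetical encoding: an adaptation action is a valid configuration of
# the feature model, represented as a frozenset of selected feature names.
def newly_introduced_actions(configs_before, configs_after):
    """Actions valid after an evolution step but not before.

    The FM-difference strategy prioritizes exploring these, instead of
    discovering them late through purely random exploration.
    """
    return set(configs_after) - set(configs_before)

# Example: evolution adds an optional "compression" feature.
before = {frozenset({"base"}), frozenset({"base", "cache"})}
after = before | {frozenset({"base", "cache", "compression"})}
explore_first = newly_introduced_actions(before, after)
# explore_first == {frozenset({"base", "cache", "compression"})}
```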



Introduction

A self-adaptive system can modify its own structure and behavior at runtime based on its perception of the environment, of itself and of its requirements [9,24,34]. An example is a self-adaptive web service which, when faced with a sudden increase in workload, may reconfigure itself by deactivating optional system features. By adapting itself at runtime, the web service is able to maintain its quality requirements (here: performance) under changing workloads.

When developing such a system, software engineers face the challenge of design time uncertainty [6,45]. Among other concerns, developing the adaptation logic requires anticipating the potential environment states the system may encounter at runtime in order to define when the system should adapt itself. Anticipating all potential environment states is in most cases infeasible due to incomplete information at design time. For instance, the concrete services that will be bound at runtime and their quality are typically not known at design time.

Online reinforcement learning is an emerging approach to realize self-adaptive systems in the presence of design time uncertainty. The goal of reinforcement learning is to optimize cumulative rewards, which the system receives as feedback for the adaptation actions it selects at runtime.
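The introduction ends on the learning objective. As a point of reference, the cumulative-reward objective is commonly realized with a value-update rule such as tabular Q-learning; the sketch below is a hypothetical illustration of one such update, not necessarily the paper's specific learning algorithm:

```python
from collections import defaultdict

# Value estimates: Q[state][action] -> expected cumulative (discounted) reward.
Q = defaultdict(lambda: defaultdict(float))

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Moves Q[state][action] toward the observed reward plus the discounted
    value of the best next action; repeated over many runtime interactions,
    this drives the system toward adaptation actions that maximize
    cumulative reward.
    """
    best_next = max(Q[next_state].values(), default=0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```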
