Abstract

Embedded control parameters of cyber-physical systems (CPS), such as sampling rate, are typically invariant and designed with a worst-case scenario in mind. In an over-engineered system, control parameters are assigned values that satisfy system-wide performance requirements at the expense of excessive energy and resource overheads. Dynamic and adaptive control parameters can reduce this overhead but are complex and require in-depth knowledge of the CPS and its operating environment, which is typically unavailable at design time. We investigate the application of reinforcement learning (RL) to dynamically adapt high-level system parameters, at run time, as a function of the system state. RL is an alternative to classical control theory for CPSs that can learn and adapt control properties without the need for an in-depth controller model. Specifically, we show that RL can modulate sampling times to save processing power without compromising control quality. We apply a novel statistical cloud-based evaluation framework to study the validity of our approach for the cart-pole balancing control problem as well as the well-known mountain car problem. The results show an improved real-world power efficiency of up to 20% compared with an optimal system with fixed controller settings.

Highlights

  • In a cyber-physical system (CPS), most generally, a physical system is controlled by an embedded control system (ECS)

  • We introduce the specific variable sampling time ECS (VS-ECS) as an example of adaptive ECSs (A-ECSs), in which the controller sampling time is changed in real time to realise more efficient embedded control in CPS applications

  • We demonstrate the suitability of reinforcement learning (RL) for adapting software properties of the ECS at run time

Summary

Introduction

In a cyber-physical system (CPS), most generally, a physical system is controlled by an embedded control system (ECS). The system parameters are set to work in the most challenging (worst-case) scenario, for which the designer validates the stability of the system. Such over-engineering results in inefficient resource usage, for example, when a sampling rate chosen to handle temporary high-bandwidth disturbances or non-linear dynamics exceeds what the current system state requires. We investigate the feasibility and effect of online adaptation of ECS parameters to improve the resource utilisation and energy consumption of the ECS and the entire CPS.
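The idea of learning to adapt a control parameter online can be illustrated with a minimal sketch, which is not the paper's exact method: tabular Q-learning selects a sampling period for a simple unstable scalar plant under proportional control, trading control error against a per-update processing cost. The plant parameters, candidate periods, and cost weights below are assumed values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

A, K = 0.5, 2.0                      # assumed plant pole and controller gain
PERIODS = [0.01, 0.05, 0.1]          # candidate sampling periods (the RL actions)
N_BINS, N_ACT = 8, len(PERIODS)
PROC_COST = 1e-3                     # assumed processing cost per control update

def bin_state(x):
    """Coarsely discretise |x| into N_BINS bins on [0, 2]."""
    return min(int(abs(x) / 2.0 * N_BINS), N_BINS - 1)

def step(x, ts):
    """One closed-loop step of length ts (forward-Euler plant, u = -K*x)."""
    u = -K * x
    x_next = x + ts * (A * x + u)
    # Reward: integrated squared error plus a fixed cost for each update,
    # so longer sampling periods incur fewer processing costs over time.
    reward = -(x_next ** 2) * ts - PROC_COST
    return x_next, reward

def train(episodes=300, alpha=0.2, gamma=0.95, eps=0.1):
    """Epsilon-greedy tabular Q-learning over (state bin, sampling period)."""
    q = np.zeros((N_BINS, N_ACT))
    for _ in range(episodes):
        x = rng.uniform(-1.5, 1.5)
        for _ in range(200):
            s = bin_state(x)
            a = rng.integers(N_ACT) if rng.random() < eps else int(np.argmax(q[s]))
            x, r = step(x, PERIODS[a])
            s2 = bin_state(x)
            q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])
    return q

q = train()
```

The learned table maps a coarse state estimate to a sampling period, mirroring the run-time adaptation described above: near the set point, a slower sampling rate suffices and saves processing effort, while larger deviations can justify faster sampling.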

Related work
Reinforcement learning
Linear approximation of continuous value functions
A-ECS based on RL
A-ECS RL environment and actions
Cloud-based evaluation framework
A-ECS development workflow
Case study 1: cart-pole swing-up task
Cart-pole dynamics
Processing power modelling
Simulation results
Swing-up and balance task
Balance-only task
Comparison to ETC
Problem definition
Findings
Conclusions