Abstract

In this work, we propose a method for the control and anti-control of chaos based on the moving largest Lyapunov exponent using reinforcement learning. In this method, we design a reward function for the reinforcement learning agent according to the moving largest Lyapunov exponent, which is analogous to a moving average but computes the corresponding largest Lyapunov exponent from a recently updated time series of fixed, short length. We adopt the density peaks-based clustering algorithm to determine a linear region of the average divergence index, so that the largest Lyapunov exponent of the small data set can be obtained by fitting the slope of that linear region. We show that the proposed method is fast and easy to implement by controlling and anti-controlling typical systems such as the Hénon map and the Lorenz system.
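The moving largest Lyapunov exponent described above can be sketched in a few lines of code. The Python snippet below is only an illustrative approximation under stated assumptions: it uses a Rosenstein-style average divergence index computed over the most recent fixed-length window, and it replaces the paper's density peaks-based clustering step for locating the linear region with a fixed fitting range; all function and parameter names (moving_largest_lyapunov, window, fit_range, etc.) are hypothetical and do not come from the paper.

```python
import numpy as np

def moving_largest_lyapunov(series, window=500, dim=3, tau=1,
                            max_steps=20, fit_range=(1, 10)):
    """Estimate the largest Lyapunov exponent from the most recent `window`
    samples of a scalar time series via a Rosenstein-style average
    divergence index. The paper selects the linear region with density
    peaks-based clustering; a fixed `fit_range` stands in for that step."""
    x = np.asarray(series[-window:], dtype=float)

    # Delay-embed the short window into vectors of dimension `dim`.
    n = len(x) - (dim - 1) * tau
    emb = np.column_stack([x[i * tau: i * tau + n] for i in range(dim)])

    # Nearest neighbour of each embedded point (excluding the point itself).
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    nn = np.argmin(dists, axis=1)

    # Average divergence index: mean log distance between initially close
    # trajectory pairs after k steps.
    divergence = []
    for k in range(1, max_steps + 1):
        valid = np.where((np.arange(n) + k < n) & (nn + k < n))[0]
        d = np.linalg.norm(emb[valid + k] - emb[nn[valid] + k], axis=1)
        d = d[d > 0]
        divergence.append(np.mean(np.log(d)) if d.size else np.nan)

    # The slope of the (assumed) linear region approximates the largest
    # Lyapunov exponent of the short data set.
    k0, k1 = fit_range
    ks = np.arange(1, max_steps + 1)
    slope, _ = np.polyfit(ks[k0 - 1:k1], divergence[k0 - 1:k1], 1)
    return slope
```

In a moving scheme of this kind, the exponent would be recomputed from the latest window at every control step and fed into the reward function; the exact reward design, and the clustering-based selection of the linear region, follow the paper rather than this sketch.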

Highlights

  • Chaos is a ubiquitous phenomenon in nature, characterized by randomness, local instability, sensitivity to initial conditions, long-term unpredictability and global boundedness

  • This study considers the control and anti-control of chaos using a reinforcement learning (RL)-based method

  • We show with numerical simulations that the method can quickly achieve the aim of controlling chaos, but sometimes fails for the anti-control of chaos when an improper choice of the state’s period parameter causes the system to fall into another periodic orbit

Summary

Introduction

Chaos is a ubiquitous phenomenon in nature, characterized by randomness, local instability, sensitivity to initial conditions, long-term unpredictability and global boundedness. A natural question is whether one can apply a reinforcement learning based method to the anti-control of chaos by appropriately designing states, actions and reward functions for the agents. State transition probabilities are hard to obtain for such systems; under this condition, one should seek the optimal strategy to control or anti-control chaos using a model-free method. Gadaleta and Dangelmayr introduced a chaos control algorithm using the temporal-difference (TD) method in RL, in which the optimal state-action value function is approximated by Q-learning or Sarsa learning, and they showed that the performance of Q-learning is better than that of Sarsa. We select Q-learning as the control and anti-control algorithm; its state-action value function is updated with the following equation.
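The update equation itself is not reproduced in this summary. For reference, the standard tabular Q-learning update consistent with the description above is given below, where $\alpha$ is the learning rate, $\gamma$ is the discount factor, $r_{t+1}$ is the reward designed from the moving largest Lyapunov exponent, and $(s_t, a_t)$ is the current state-action pair:

$$
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
$$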

Numerical simulation
Control policy with the moving largest Lyapunov exponent method
Findings
Conclusions and discussions