Abstract

We consider a principal (controller) who can pick actions from a fixed action set to control an evolving system with converging dynamics, where the actions are interpreted as different configurations or policies. Converging dynamics means that if the principal holds the same action, the system asymptotically converges (possibly after a significant amount of time) to a unique stable state determined by that action. This phenomenon can be observed in diverse domains such as epidemic control, computing systems, and markets. In our model, the dynamics of the system are unknown to the principal, who receives only (possibly noisy) bandit feedback on the impact of its actions. The principal aims to learn which stable state yields the highest reward while adhering to specific constraints (i.e., the optimal stable state) and to drive the system into this state as quickly as possible. A unique challenge in our model is that the principal has no prior knowledge of the stable state of each action, yet waiting for the system to converge to suboptimal stable states costs valuable time. We measure the principal's performance in terms of regret and constraint violation. When the action set is finite, we propose a novel algorithm, termed Optimistic-Pessimistic Convergence and Confidence Bounds (OP-C2B), that switches away from an action quickly if it is not worth waiting until its stable state is reached. This is enabled by employing "convergence bounds" to determine how far the system is from the stable states, and by choosing actions through a pessimistic assessment of the set of feasible actions while acting optimistically within this set. We establish that OP-C2B ensures sublinear regret and constraint violation simultaneously; in particular, it achieves logarithmic regret and constraint violation when the system convergence rate is linear or superlinear.
Furthermore, we generalize OP-C2B to the case of an infinite action set and demonstrate that it maintains sublinear regret and constraint violation. Finally, we present two game control problems, mobile crowdsensing and resource allocation, that our model can address.
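The "pessimistic feasible set, optimistic choice" idea can be illustrated with a toy sketch. Everything below is hypothetical and is not the OP-C2B algorithm itself: it ignores the convergence bounds and system dynamics central to the paper, treating each action as a stochastic bandit arm with a reward and a constraint cost; the class, parameter names, and fallback rule are our own assumptions.

```python
import math


def radius(count, t, c=2.0):
    """Hoeffding-style confidence radius after `count` pulls at round t."""
    return math.sqrt(c * math.log(max(t, 2)) / count)


class OptimisticPessimisticBandit:
    """Toy sketch (hypothetical, not OP-C2B): filter actions by a
    pessimistic (upper-bound) cost estimate, then act optimistically
    on reward within the surviving set."""

    def __init__(self, n_actions, budget):
        self.n = n_actions
        self.budget = budget              # constraint: expected cost <= budget
        self.counts = [0] * n_actions
        self.reward_sum = [0.0] * n_actions
        self.cost_sum = [0.0] * n_actions
        self.t = 0

    def select(self):
        self.t += 1
        # Pull each arm once before trusting any estimate.
        for a in range(self.n):
            if self.counts[a] == 0:
                return a
        ucb_r, ucb_c = [], []
        for a in range(self.n):
            rad = radius(self.counts[a], self.t)
            ucb_r.append(self.reward_sum[a] / self.counts[a] + rad)  # optimistic reward
            ucb_c.append(self.cost_sum[a] / self.counts[a] + rad)    # pessimistic cost
        feasible = [a for a in range(self.n) if ucb_c[a] <= self.budget]
        if feasible:
            # Optimism within the pessimistically assessed feasible set.
            return max(feasible, key=lambda a: ucb_r[a])
        # No action looks safely feasible yet: fall back to the one with
        # the smallest pessimistic cost estimate.
        return min(range(self.n), key=lambda a: ucb_c[a])

    def update(self, action, reward, cost):
        self.counts[action] += 1
        self.reward_sum[action] += reward
        self.cost_sum[action] += cost
```

In this simplification the learner quickly concentrates on the highest-reward arm whose cost upper bound fits the budget; OP-C2B additionally decides how long to wait for each action's state to converge, which this sketch omits entirely.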