Abstract

This paper considers an online control problem involving two controllers. A central controller chooses an action from a feasible set that is determined by time-varying, coupling constraints, which depend on all past actions and states. The central controller's goal is to minimize the cumulative cost; however, the controller has direct access to neither the feasible set nor the dynamics, both of which are determined by a remote local controller. Instead, the central controller receives only an aggregate summary of the feasibility information from the local controller, which does not know the system costs. We show that an online algorithm using this feasibility information can nearly match the dynamic regret of an online algorithm using perfect information whenever the feasible sets satisfy a causal invariance criterion and the prediction window is sufficiently large. To do so, we use a form of feasibility aggregation based on entropic maximization in combination with a novel online algorithm, named Penalized Predictive Control (PPC), and we demonstrate that the aggregated information can be efficiently learned using reinforcement learning algorithms. The effectiveness of our approach for closed-loop coordination between central and local controllers is validated via an electric vehicle charging application in power systems.
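
To make the penalized selection idea concrete, here is a minimal sketch of a PPC-style step, assuming a finite grid of candidate actions, a per-step feedback distribution supplied by the local controller, and a trade-off weight beta. The function name, the discretization, and the exact penalty form are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def ppc_action(candidates, cost_fn, feedback, beta=1.0, eps=1e-12):
    """One step of a Penalized Predictive Control-style rule (sketch).

    Chooses the candidate action minimizing the stage cost plus an
    entropic penalty -beta * log(feedback), which steers the central
    controller away from actions that the local controller's aggregate
    feasibility feedback marks as nearly infeasible.
    """
    costs = np.array([cost_fn(u) for u in candidates])
    penalties = -beta * np.log(np.asarray(feedback, dtype=float) + eps)
    return candidates[int(np.argmin(costs + penalties))]

# Usage with a toy quadratic cost and uniform feedback over 11 actions.
grid = np.linspace(0.0, 1.0, 11)
u = ppc_action(grid, lambda u: (u - 0.3) ** 2, np.full(11, 1 / 11), beta=0.1)
```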

Highlights

  • The use of online learning methods for controlling dynamical systems has captured increasing attention from both the learning and control communities

  • Denoting by d the diameter of the action space U, by T the total number of time steps, and by w the number of predictions available, we show that the dynamic regret of any deterministic policy must satisfy the lower bound Regret(u) = Ω(d(T − w)) for any feasible action sequence u generated by the policy, even if the policy has full information about the constraints (see the formal statement after this list)

  • This paper studies and analyzes the closed-loop control framework created by the interaction between a central controller and a local controller
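
For reference, the dynamic regret behind the second highlight is conventionally measured against the best feasible action sequence chosen in hindsight. The statement below is a sketch in standard notation; the symbol \mathcal{F} for the set of feasible action sequences is an assumption, as the excerpt does not fix the notation.

```latex
% Dynamic regret of a feasible action sequence u = (u_1, ..., u_T),
% measured against the best feasible sequence chosen in hindsight
% (\mathcal{F} denotes the set of feasible action sequences):
\mathrm{Regret}(u) \;=\; \sum_{t=1}^{T} c_t(u_t)
  \;-\; \min_{v \in \mathcal{F}} \sum_{t=1}^{T} c_t(v_t).
% The highlighted lower bound: with action-space diameter d and a
% prediction window of w steps, every deterministic policy satisfies
\mathrm{Regret}(u) \;=\; \Omega\!\big(d\,(T - w)\big).
```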

Summary

INTRODUCTION

The use of online learning methods for controlling dynamical systems has captured increasing attention from both the learning and control communities. Denoting by d the diameter of the action space U, by T the total number of time steps, and by w the number of predictions available, we show that the dynamic regret of any deterministic policy must satisfy the lower bound Regret(u) = Ω(d(T − w)) for any feasible action sequence u generated by the policy, even if the policy has full information about the constraints.

Note that it is well known that, in the worst case, sub-linear dynamic regret without the use of predictions is impossible (cf. [24]). In the case of stochastic long-term constraints, the authors in [49] achieve O(√T log T) regret and constraint violations with high probability. However, neither bandit nor gradient feedback is designed to handle time-coupling constraints, and there are no results providing guaranteed performance for the general setting in (1). Throughout, the costs are assumed to be Lipschitz continuous: |c_t(u) − c_t(v)| ≤ L_c ‖u − v‖₂ for all u, v ∈ U and t ∈ [T].
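
Bandit and gradient feedback fall short here because feasibility couples actions across time; the paper's alternative is to have the local controller aggregate feasibility into a distribution over current actions via entropic maximization. The brute-force sketch below only illustrates that idea under strong simplifying assumptions (a small finite action grid, a short horizon, and a black-box trajectory feasibility test `feasible`, all hypothetical); the paper computes and learns such feedback far more efficiently, e.g., with reinforcement learning.

```python
from itertools import product

def max_entropy_feedback(candidates, horizon, feasible):
    """Illustrative maximum-entropy-style feedback (brute force).

    The uniform distribution over all feasible length-`horizon`
    trajectories maximizes entropy subject to feasibility; its
    marginal over the first action summarizes how much future
    flexibility each current action preserves.
    """
    counts = {u: 0 for u in candidates}
    for traj in product(candidates, repeat=horizon):
        if feasible(traj):
            counts[traj[0]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no feasible trajectory")
    return {u: c / total for u, c in counts.items()}

# Toy example: actions 0/1/2 with a total budget of 3 over 3 steps.
p = max_entropy_feedback([0, 1, 2], horizon=3,
                         feasible=lambda traj: sum(traj) <= 3)
```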

The paper is organized as follows:

  • INTRODUCTION: Dynamic regret; Example
  • INFORMATION AGGREGATION: Predictions; Aggregation Feedback; Key Idea; Algorithm; Framework
  • RESULTS: Feasibility; A fundamental limit; Causally invariant safety constraints
  • LEARNING MAXIMUM ENTROPY FEEDBACK AND PREDICTIONS
  • CASE STUDY: Results
  • CONCLUDING REMARKS AND FUTURE DIRECTIONS