Abstract

Designing smart (or active) systems, which perform automated tasks intelligently based on interaction with their environments, requires solving the physical system design and control system design problems together. In this paper, we present a model-free, on-policy reinforcement learning approach to such control co-design problems. The approach uses discrete two-time-scale reinforcement learning: an inner loop with a fast time scale addresses the control system design, and an outer loop with a slower time scale addresses the physical system design. Both design problems use the same temporal-difference-based Q-learning formulation. We apply this two-time-scale reinforcement learning approach to the online video game EcoRacer, where the physical system design is the choice of a gear ratio for an electric vehicle and the control system design is the sequence of acceleration and braking decisions needed to finish a track within a limited time with minimum energy consumption. The results show that the proposed approach finds the system-optimal solution for the EcoRacer case study within a reasonable computation time, without requiring any knowledge of the physics governing the system. The proposed method is generalizable and has the potential to take advantage of ongoing developments in the field of reinforcement learning.
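The two-loop structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it is not EcoRacer: the track, the dynamics in `run_episode`, the gear values, the failure penalty, and all hyperparameters are hypothetical choices made only to keep the example small. The inner loop applies a standard tabular Q-learning update to the control decisions for a fixed gear, and the outer loop applies the same temporal-difference form to a single-state value table over gear choices, updated once per batch of inner episodes. The abstract does not state the exact update rule used in the paper, so the rule below is an assumption.

```python
import random

# Hypothetical toy problem (NOT EcoRacer): a car on a 1-D track.
GEARS = [1, 2, 3]       # candidate gear ratios (physical design variable), assumed
ACTIONS = [0, 1]        # 0 = coast, 1 = accelerate (control variable)
TRACK_LEN = 6           # cells to cover
MAX_STEPS = 4           # time limit in control steps
FAIL_PENALTY = -50.0    # assumed penalty for not finishing in time


def run_episode(q_ctrl, gear, rng, alpha=0.2, gamma=1.0, eps=0.1):
    """Inner (fast) loop: one control episode with tabular Q-learning.

    Assumed dynamics: accelerating moves `gear` cells and costs gear**2
    energy; coasting moves nothing and costs nothing.
    Returns the total (negative) reward of the episode.
    """
    pos, total = 0, 0.0
    for t in range(MAX_STEPS):
        state = (gear, pos, t)
        # Epsilon-greedy action selection over the control Q-table.
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_ctrl.get((state, a), 0.0))

        new_pos = min(pos + gear * action, TRACK_LEN)
        reward = -float(gear ** 2) * action
        done = new_pos >= TRACK_LEN
        timed_out = (not done) and t == MAX_STEPS - 1
        if timed_out:
            reward += FAIL_PENALTY

        # Temporal-difference target: bootstrap unless the episode ended.
        if done or timed_out:
            target = reward
        else:
            nxt = (gear, new_pos, t + 1)
            target = reward + gamma * max(
                q_ctrl.get((nxt, a), 0.0) for a in ACTIONS
            )
        old = q_ctrl.get((state, action), 0.0)
        q_ctrl[(state, action)] = old + alpha * (target - old)

        total += reward
        pos = new_pos
        if done:
            break
    return total


def co_design(outer_iters=200, inner_episodes=20, beta=0.3, eps=0.2, seed=0):
    """Outer (slow) loop: learn a value for each gear from inner-loop returns."""
    rng = random.Random(seed)
    q_ctrl = {}                          # control Q-table, shared across gears
    q_design = {g: 0.0 for g in GEARS}   # single-state Q-table over designs
    for _ in range(outer_iters):
        # Epsilon-greedy choice of the physical design.
        if rng.random() < eps:
            gear = rng.choice(GEARS)
        else:
            gear = max(GEARS, key=lambda g: q_design[g])
        # Fast time scale: several control episodes per design update.
        ret = 0.0
        for _ in range(inner_episodes):
            ret = run_episode(q_ctrl, gear, rng)
        # Slow time scale: TD-style update using the latest episode return.
        q_design[gear] += beta * (ret - q_design[gear])
    best_gear = max(GEARS, key=lambda g: q_design[g])
    return best_gear, q_design, q_ctrl


if __name__ == "__main__":
    best, q_design, _ = co_design()
    print("learned design values:", q_design)
    print("selected gear:", best)
```

In this toy setup the lowest gear cannot reach the end of the track within the time limit, so it always incurs the failure penalty, while higher gears finish at different energy costs. This mirrors, in a very reduced form, the trade-off between the time limit and energy consumption that the abstract describes.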
