Abstract

Self-Preserving Genetic Algorithms (SPGA) combine the evolutionary strategy of a genetic algorithm (GA) with the safety assurance methods commonly used in safe reinforcement learning (SRL), a branch of reinforcement learning (RL) that accounts for safety in the agent's exploration and decision-making. Safe learning approaches are especially important in safety-critical environments such as many cyber-physical systems, where failing to account for the safety of the controlled system could cost millions of dollars in hardware or cause bodily harm to people working nearby. While SRL is a viable approach to safe learning, training agents raises several challenges, including sample efficiency, stability, and exploration. The evolutionary strategy of a GA readily addresses the last of these. By combining GAs with the safety mechanisms used in SRL, SPGA offers a safe learning alternative that can explore large areas of the solution space. This work implements SPGA with both action masking and run time assurance safety strategies to evolve safe controllers for three types of discrete-action-space environments applicable to cyber-physical systems (control, routing, and operations) under various safety conditions. To validate the results, training and testing metrics are compared against those of SRL-trained controllers. SPGA and SRL controllers are trained across 5 random seeds and evaluated on 500 episodes to calculate three metrics: average wall time to train, average expected return, and percentage of safe actions. SPGA achieves comparable reward and safety performance with significantly improved training efficiency (55x faster on average), demonstrating the effectiveness of this safe learning approach.
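
To make the idea of pairing a GA with action masking concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy corridor environment (cell 0 is a hazard, the last cell is the goal), a tabular genome of per-state action scores, and a simple elitist GA with Gaussian mutation. All names, constants, and the reward scheme are illustrative assumptions; the abstract does not specify the environments' dynamics or the GA operators used.

```python
import random

# Hypothetical toy setup (not from the paper): a corridor of cells 0..5.
N_STATES = 6          # cell 0 is a hazard, cell 5 is the goal
ACTIONS = (-1, 0, 1)  # move left, stay, move right
START = 2

def next_state(state, action_index):
    """Apply an action and clip the result to the corridor bounds."""
    return min(max(state + ACTIONS[action_index], 0), N_STATES - 1)

def safe_mask(state):
    """One boolean per action: True if the action does not lead into the hazard."""
    return [next_state(state, i) != 0 for i in range(len(ACTIONS))]

def masked_action(genome, state):
    """Action masking: pick the best-scoring action among the safe ones only."""
    allowed = [i for i, ok in enumerate(safe_mask(state)) if ok]
    return max(allowed, key=lambda i: genome[state][i])

def rollout(genome, max_steps=20):
    """Run one deterministic episode; return (return, count of hazard visits)."""
    state, total, unsafe = START, 0.0, 0
    for _ in range(max_steps):
        state = next_state(state, masked_action(genome, state))
        unsafe += state == 0
        total -= 1.0               # assumed step cost
        if state == N_STATES - 1:
            total += 10.0          # assumed goal bonus
            break
    return total, unsafe

def random_genome(rng):
    return [[rng.gauss(0, 1) for _ in ACTIONS] for _ in range(N_STATES)]

def mutate(genome, rng, sigma=0.5):
    """Gaussian mutation of every action score."""
    return [[w + rng.gauss(0, sigma) for w in row] for row in genome]

def evolve(generations=30, pop_size=20, elite=5, seed=0):
    """Elitist GA: keep the top genomes, refill the population with mutants."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    total_unsafe = 0
    for _ in range(generations):
        scored = []
        for genome in population:
            ret, unsafe = rollout(genome)
            total_unsafe += unsafe
            scored.append((ret, genome))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        elites = [genome for _, genome in scored[:elite]]
        children = [mutate(rng.choice(elites), rng) for _ in range(pop_size - elite)]
        population = elites + children
    best = max(population, key=lambda g: rollout(g)[0])
    return best, total_unsafe

if __name__ == "__main__":
    best, total_unsafe = evolve()
    print("best return:", rollout(best)[0])
    print("hazard visits during training:", total_unsafe)
```

Because the mask is applied at action selection, every genome in the population is prevented from entering the hazard during both training and evaluation, regardless of how poorly it scores. A run time assurance variant would instead let the genome propose any action and substitute a backup action when the proposal is judged unsafe.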
