Abstract

Reinforcement learning (RL) is a powerful tool for training agents to perform complex tasks. However, RL agents often learn to behave in unsafe or unintended ways, especially during the exploration phase, when the agent is still learning about its environment. This research adapts safe exploration methods from the field of robotics and evaluates their effectiveness against algorithms commonly used in complex videogame environments without safe exploration. We also propose a method for hand-crafting catastrophic states, i.e., states that are known to be unsafe for the agent to visit. Our results show that our method and our hand-crafted safety constraints outperform state-of-the-art algorithms at certain training iterations, meaning that our method learns to behave safely while still achieving good performance. These results have implications for the future development of human-level safe learning in combination with model-based RL in complex videogame environments. By developing safe exploration methods, we can help ensure that RL agents can be deployed in a variety of real-world applications, such as self-driving cars and robotics.
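
The abstract does not specify how the hand-crafted catastrophic states are represented, so the following is only a minimal illustrative sketch of one common way to encode them: a hand-written predicate over observations, enforced by an environment wrapper that ends the episode with a penalty when an unsafe state is reached. The Gymnasium API, the `is_catastrophic` predicate, and the penalty value are all assumptions for illustration, not the paper's method.

```python
import gymnasium as gym


def is_catastrophic(obs) -> bool:
    # Hypothetical hand-crafted constraint: e.g. the agent's health reaching zero,
    # or its position entering a known hazard region of the game map.
    health, x, y = obs[0], obs[1], obs[2]
    return health <= 0.0 or (3.0 <= x <= 5.0 and y < 1.0)


class CatastrophePenaltyWrapper(gym.Wrapper):
    """Terminate with a large negative reward when a hand-crafted unsafe state is hit."""

    def __init__(self, env, penalty: float = -100.0):
        super().__init__(env)
        self.penalty = penalty

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if is_catastrophic(obs):
            reward = self.penalty   # discourage revisiting the unsafe region
            terminated = True       # treat the catastrophe as episode-ending
            info["catastrophe"] = True
        return obs, reward, terminated, truncated, info
```

In practice such a wrapper would sit between the game environment and the learning algorithm, so the agent receives the safety signal during exploration without any change to the underlying policy-optimization code.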
