Abstract

Assuring the stability of the guidance law for quadrotor-type Urban Air Mobility (UAM) is important since such vehicles are expected to operate in urban areas. Model-free reinforcement learning has been applied intensively for this purpose in recent studies. In reinforcement learning, the environment is an important part of training. The Proximal Policy Optimization (PPO) algorithm is widely used for reinforcement learning of quadrotors. However, PPO tends to fail to guarantee the stability of the guidance law as the search space of the environment grows. In this work, we show improved stability in a multi-agent quadrotor-type UAM environment by applying the Soft Actor-Critic (SAC) reinforcement learning algorithm. The simulations were performed in Unity. Our approach achieved three times higher reward in the UAM environment than training with the PPO algorithm, and it also trained faster than PPO.

Keywords: Deep reinforcement learning · Multi-agent system · Traffic network · Urban Air Mobility (UAM) · Proximal Policy Optimization (PPO) · Soft Actor-Critic (SAC)
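
For context, the following is a minimal sketch of how SAC training against a Unity environment might be wired up. It is not the authors' implementation: it assumes the Unity scene is exposed through ML-Agents' single-agent Gym wrapper and uses Stable-Baselines3's off-the-shelf SAC; the executable name and all hyperparameters are illustrative.

    # Minimal sketch (not the authors' code): SAC on a Unity environment
    # via ML-Agents' Gym wrapper and Stable-Baselines3.
    from mlagents_envs.environment import UnityEnvironment
    from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
    from stable_baselines3 import SAC

    # Connect to a built Unity executable (the file name is hypothetical).
    unity_env = UnityEnvironment(file_name="UAMQuadrotor")
    env = UnityToGymWrapper(unity_env)

    # Off-policy SAC: entropy-regularized actor-critic with a replay buffer,
    # the algorithm the paper contrasts against on-policy PPO.
    model = SAC(
        "MlpPolicy",
        env,
        learning_rate=3e-4,      # illustrative defaults, not the paper's values
        buffer_size=1_000_000,
        ent_coef="auto",         # automatic entropy-temperature tuning
        verbose=1,
    )
    model.learn(total_timesteps=1_000_000)
    model.save("sac_uam_guidance")

Because SAC is off-policy and reuses transitions from its replay buffer, it can be more sample-efficient than PPO's on-policy updates, which is consistent with the faster training the abstract reports.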
