Abstract

This paper presents a deep-reinforcement-learning approach to the crowd navigation problem in an unknown and dynamic environment. In the first stage, four leader agents learn to reach their goals while avoiding collisions with static and dynamic obstacles in an unknown environment, using Proximal Policy Optimization (PPO) combined with Long Short-Term Memory (LSTM) and a collision prediction algorithm. In the second stage, each leader agent travels to a specific goal several times, and its trajectory is recorded as a guiding path that tells the members of its group how to reach their goals. We adopt the Reciprocal Velocity Obstacle (RVO) algorithm to prevent collisions between agents. Finally, we simulate four groups moving toward their goals simultaneously in the Unity 3D engine. The experimental results demonstrate the self-learning ability of a crowd that reaches its goals successfully in an unknown and dynamic environment.
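The second stage described above amounts to recording a trained leader's positions and replaying them as waypoints for group members. The following is a minimal sketch of that idea, assuming a 2D environment; the class and parameter names (GuidingPath, waypoint_radius) are illustrative, not taken from the paper.

```python
import math

class GuidingPath:
    """Record a leader's trajectory and serve it as waypoints to group members."""

    def __init__(self, waypoint_radius=0.5):
        self.waypoints = []                  # recorded leader positions (x, y)
        self.waypoint_radius = waypoint_radius

    def record(self, leader_pos):
        """Append the leader's current position while it walks to its goal."""
        self.waypoints.append(leader_pos)

    def next_waypoint(self, member_pos, index):
        """Advance a member's waypoint index once it is close enough."""
        x, y = member_pos
        wx, wy = self.waypoints[index]
        if math.hypot(wx - x, wy - y) < self.waypoint_radius:
            index = min(index + 1, len(self.waypoints) - 1)
        return index
```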

Highlights

  • Crowd simulation has been gaining considerable attention due to its applications in entertainment, education, architecture, training, urban engineering and virtual heritage

  • We adopt the Reciprocal Velocity Obstacle (RVO) algorithm to prevent agents from colliding with one another, and we simulate the scenario of four groups moving towards their goals simultaneously

  • Our method is inspired by deep reinforcement learning, which overcomes the limitations of Q-learning methods (a discrete action space and intractability in high-dimensional state spaces) and finds feasible, collision-free paths for crowds among static and dynamic obstacles by assigning rewards and penalties (see the sketch after this list)
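The last highlight relies on reward shaping. The paper does not list its exact reward terms, so the following sketch uses assumed terms and weights: a bonus for reaching the goal, a penalty for collisions, and a small progress-based shaping term.

```python
def reward(reached_goal, collided, prev_dist, curr_dist):
    """Assumed reward shaping for goal-directed, collision-avoiding navigation."""
    if reached_goal:
        return 1.0        # bonus for arriving at the goal
    if collided:
        return -1.0       # penalty for hitting an obstacle or another agent
    # shaping: reward progress toward the goal, penalize each extra step
    return 0.1 * (prev_dist - curr_dist) - 0.01
```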


Summary

INTRODUCTION

Crowd simulation has been gaining considerable attention due to its applications in entertainment, education, architecture, training, urban engineering and virtual heritage. Path planning and decision making ensure that agents reach their goals in an optimal way without colliding with obstacles or other agents; they are therefore central aspects of crowd simulation that deserve great research effort. We first make four leader agents learn how to reach their goals and avoid collisions with static and dynamic obstacles in an unknown environment, using Proximal Policy Optimization (PPO) [35] combined with Long Short-Term Memory (LSTM) [34] and a collision prediction algorithm [38]. The agents learn to avoid collisions with static obstacles in an unknown environment even when the positions of those obstacles change, which shows that they acquire a human-like self-learning ability.
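The following is a minimal sketch of a recurrent policy of the kind the introduction describes: an LSTM over recent observations feeding a PPO actor-critic head with the standard clipped surrogate loss. PyTorch, the layer sizes, and the observation/action dimensions are assumptions for illustration; the paper's exact architecture is not reproduced here.

```python
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    """LSTM-based actor-critic, as used with PPO for partially observed navigation."""

    def __init__(self, obs_dim=40, action_dim=2, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.policy_mean = nn.Linear(hidden, action_dim)  # mean of continuous action
        self.log_std = nn.Parameter(torch.zeros(action_dim))
        self.value = nn.Linear(hidden, 1)                 # state value for PPO

    def forward(self, obs_seq, hidden_state=None):
        # obs_seq: (batch, time, obs_dim); LSTM keeps memory of past observations
        x = self.encoder(obs_seq)
        x, hidden_state = self.lstm(x, hidden_state)
        dist = torch.distributions.Normal(self.policy_mean(x), self.log_std.exp())
        return dist, self.value(x), hidden_state

def ppo_clip_loss(dist, actions, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective (returned as a loss to be minimized)."""
    log_probs = dist.log_prob(actions).sum(-1)
    ratio = torch.exp(log_probs - old_log_probs)
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```

The LSTM lets the policy act on a short history of observations rather than a single frame, which is why the introduction pairs it with PPO for navigation among moving obstacles.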

RELATED WORKS
POLICY REPRESENTATION
RESULTS
CONCLUSION
