The Novel Application of Deep Reinforcement to Solve the Rebalancing Problem of Bicycle Sharing Systems with Spatiotemporal Features

Baoran Pan, Yingdong Pei, Lixin Tian

doi:10.3390/app13179872

Abstract

To address the Bicycle Rebalancing Problem (BRP), we established a Bicycle Rebalancing Incentive System (BRIS). In BRIS, the bicycle operator offers financial compensation to encourage cyclists to detour from specific stations where the number of bikes is excessive or insufficient and to use suitable stations instead. BRIS mainly includes two components: the Bike Gym, which imitates the bicycle environment, and the Spatiotemporal Rebalancing Pricing Algorithm (STRPA), which determines the amount of money given to the cyclist as a function of time. STRPA, the core contribution of this paper, is a deep reinforcement learning model based on the actor–critic structure. In STRPA, a hierarchical principle is introduced to mitigate the curse of dimensionality, and the graph matrix A is introduced to represent the complex relationships among nodes. In addition, traffic data, including bicycle data, have strong temporal and spatial characteristics. The gated recurrent unit (GRU), a sub-module of STRPA, extracts the temporal characteristics well, and the graph convolution network (GCN), also a sub-module of STRPA, extracts the spatial characteristics. Finally, our model outperforms the baseline model when evaluated on the public bicycle data of Nanjing.
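To make the spatiotemporal idea concrete, the following is a minimal NumPy sketch of how a graph convolution over stations can feed a GRU over time. It is an illustration under stated assumptions, not the STRPA implementation: the symmetric normalization of A, the single-layer design, the omission of bias terms, and all function and parameter names (`normalize_adjacency`, `gcn_layer`, `gru_step`, `encode`, `Wz`, `Uz`, and so on) are hypothetical choices made here for exposition.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the normalization of a standard GCN layer.
    Assumption: the paper's graph matrix A may be built or normalized differently."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, X, W):
    """Spatial step: mix each station's features with its neighbours', then ReLU."""
    return np.maximum(A_norm @ X @ W, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, p):
    """Temporal step: one GRU update applied to every station (row) at once.
    Biases are omitted for brevity."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])               # update gate
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])               # reset gate
    h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return (1.0 - z) * h + z * h_tilde

def encode(A, X_seq, W, p, hidden):
    """Run the GCN on each time slice and carry the result through the GRU."""
    A_norm = normalize_adjacency(A)
    h = np.zeros((A.shape[0], hidden))
    for X_t in X_seq:                            # X_t: (stations, features)
        h = gru_step(gcn_layer(A_norm, X_t, W), h, p)
    return h                                     # (stations, hidden)
```

In an actor–critic setting, an embedding such as the one returned by the hypothetical `encode` would be passed to separate actor and critic heads; that part is not sketched here.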

Full Text