Fingerprint localization based on Channel State Information (CSI) plays a vital role given the popularity of Location-Based Services. It is easy to implement, has low device cost, and CSI provides fine-grained information that can achieve adequate accuracy. However, the main drawback of this approach is that the fingerprint map must be constructed manually during the off-line stage, which is tedious and time-consuming. In this paper, we propose a novel data collection strategy for path planning based on reinforcement learning, namely the Asynchronous Advantage Actor-Critic (A3C) algorithm. Given a limited exploration step length, the strategy needs to maximize the amount of informative CSI data so as to reduce the manual cost. We collect a small amount of real data in advance and predict the rewards of all sampling points via a multivariate Gaussian process and mutual information. The optimization problem is then transformed into a sequential decision process, and A3C exploits this to find an informative path. We implement the proposed algorithm in two real-world dynamic environments, and extensive experiments verify its performance. Compared to coverage path planning, our system not only achieves similar indoor localization accuracy but also reduces the CSI collection cost.
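The reward-prediction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the kernel choice, hyperparameters, and the information-gain formula `0.5 * log(1 + var / noise)` are assumptions standing in for the paper's multivariate Gaussian process and mutual-information computation. A GP fitted on a few pre-collected points yields a predictive variance at each unvisited sampling point, which is converted into a per-point reward for the planner.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 2-D coordinates."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_rewards(train_xy, cand_xy, noise=1e-2):
    """Information-gain rewards at candidate points (illustrative).

    The GP predictive variance depends only on the sampled locations,
    so no CSI measurements are needed to score candidates.
    """
    K = rbf_kernel(train_xy, train_xy) + noise * np.eye(len(train_xy))
    Ks = rbf_kernel(cand_xy, train_xy)
    Kss = rbf_kernel(cand_xy, cand_xy)
    var = np.diag(Kss - Ks @ np.linalg.inv(K) @ Ks.T)
    # Reward ~ mutual information gained by observing each candidate
    # (hypothetical form; clip tiny negative values from round-off).
    return 0.5 * np.log1p(np.maximum(var, 0.0) / noise)

rng = np.random.default_rng(0)
train_xy = rng.uniform(0, 10, size=(5, 2))   # pre-collected sampling points
cand_xy = rng.uniform(0, 10, size=(20, 2))   # unvisited candidate points
rewards = gp_rewards(train_xy, cand_xy)
print(rewards.shape)  # (20,)
```

Candidates far from every visited point retain high predictive variance and thus high reward, which is what drives the A3C agent toward informative, under-sampled regions.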