Abstract

The process and measurement noise covariance matrices strongly affect Extended Kalman Filter (EKF) performance, and in practice they are often hand-tuned, a tedious task. Q-learning, a well-known reinforcement learning method, has recently been applied to adapt the noise covariance matrices of the EKF, thanks to its simplicity and its ability to handle uncertain environments. However, designing a Q-learning-based EKF (QLEKF) typically involves heuristics, such as choosing the tuning grid size and the candidate covariance values for each state, and estimation performance inevitably degrades when these heuristics are unsuitable. We propose a dynamic grid-based Q-learning EKF (DG-QLEKF) to overcome this drawback, which introduces two novelties: an updated ϵ-greedy algorithm and a dynamic grid strategy. Together they allow the filter to thoroughly exploit an arbitrary search scope and find appropriate values for the noise covariance matrices. The effectiveness of the DG-QLEKF, applied to attitude and bias estimation in navigation, is validated through Monte Carlo simulation and real flight data from an unmanned aerial vehicle. The DG-QLEKF yields substantially improved state estimation over both the QLEKF and the traditional EKF.
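The core idea of the dynamic grid strategy can be illustrated with a minimal sketch. The code below is not the paper's algorithm; it is a hypothetical, bandit-style simplification in which a single noise-covariance scale is selected from a candidate grid via ϵ-greedy Q-learning, and the grid is periodically re-centred and shrunk around the best candidate. The reward here (`-(scale - true_scale)**2`) is a stand-in for whatever filter-performance metric the actual method would use; `true_scale`, `refine_grid`, and `tune_scale` are all illustrative names.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    # Explore with probability epsilon, otherwise pick the best-known action.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

def refine_grid(grid, best_idx, shrink=0.5):
    # Dynamic-grid step: build a finer grid of the same size, centred on the
    # currently best candidate and covering a fraction `shrink` of the old span.
    center = grid[best_idx]
    span = (grid[-1] - grid[0]) * shrink
    n = len(grid)
    lo = max(center - span / 2.0, 1e-9)  # keep covariance scales positive
    step = span / (n - 1)
    return [lo + k * step for k in range(n)]

def tune_scale(true_scale=0.3, episodes=300, epsilon=0.2, alpha=0.1, seed=0):
    # Hypothetical tuning loop: in the real filter the reward would come from
    # an innovation-based performance metric, not from a known true scale.
    rng = random.Random(seed)
    grid = [0.01 * (10 ** (k / 2)) for k in range(5)]  # log-spaced candidates
    q = [0.0] * len(grid)
    for t in range(episodes):
        a = epsilon_greedy(q, epsilon, rng)
        reward = -(grid[a] - true_scale) ** 2
        q[a] += alpha * (reward - q[a])  # bandit-style Q update
        # Periodically refine the grid around the current best candidate.
        if (t + 1) % 100 == 0 and (t + 1) < episodes:
            best = max(range(len(q)), key=lambda i: q[i])
            grid = refine_grid(grid, best)
            q = [0.0] * len(grid)
    return grid[max(range(len(q)), key=lambda i: q[i])]
```

The refinement step is what removes the dependence on a well-chosen initial grid: a coarse, wide grid can be supplied at the start, and successive shrink-and-recentre passes concentrate the candidates near the best-performing covariance value.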
