Abstract
Deep reinforcement learning has developed rapidly by using neural networks as function approximators in reinforcement learning, which has produced preliminary results for sequential decision-making in continuous spaces. However, deep reinforcement learning depends heavily on large amounts of training and requires accurate reward signals. In many real-world problems, such as robot learning, a well-specified reward is generally unavailable and training cannot be unlimited, so the agent must be able to learn quickly. In this paper, we propose a deep reinforcement learning model with meta-learning, which we call the meta Q-network (MQN). The model uses an LSTM-based meta-learner to update the Q-network. This alleviates several problems in deep reinforcement learning models, such as the instability of Q-network training, and we demonstrate the improved performance through experiments. Although the model is not optimal, the combination of meta-learning and reinforcement learning remains highly desirable.
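To make the idea of a meta-learner updating a Q-network concrete, the following is a minimal sketch and not the MQN implementation. It assumes a linear Q-function, a semi-gradient TD error, and a coordinate-wise LSTM cell that maps each parameter's gradient to an update. The abstract specifies none of these, and all names (`LSTMMetaLearner`, `td_gradient`, `meta_update`) are hypothetical. The meta-learner weights are random here; in practice they would be trained across tasks.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class LSTMMetaLearner:
    """Hypothetical coordinate-wise LSTM: one shared cell applied to every parameter."""

    def __init__(self, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden = hidden
        # Input is [gradient, previous hidden state]; output is the 4 LSTM gates.
        self.W = rng.normal(0.0, 0.1, (4 * hidden, 1 + hidden))
        self.b = np.zeros(4 * hidden)
        self.w_out = rng.normal(0.0, 0.1, hidden)

    def init_state(self, n_params):
        zeros = np.zeros((n_params, self.hidden))
        return zeros, zeros.copy()

    def step(self, grad, state):
        """Map a flat gradient vector to a flat parameter update."""
        h, c = state
        x = np.concatenate([grad[:, None], h], axis=1)
        z = x @ self.W.T + self.b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h @ self.w_out, (h, c)


def td_gradient(theta, s, a, r, s_next, gamma=0.99):
    """Semi-gradient of 0.5 * (Q(s, a) - target)^2 for linear Q(s, a) = theta[a] @ s."""
    q_sa = theta[a] @ s
    target = r + gamma * np.max(theta @ s_next)  # target treated as a constant
    grad = np.zeros_like(theta)
    grad[a] = (q_sa - target) * s
    return grad


def meta_update(theta, learner, state, s, a, r, s_next):
    """Replace the usual fixed-step SGD update with the meta-learner's proposal."""
    grad = td_gradient(theta, s, a, r, s_next).ravel()
    delta, state = learner.step(grad, state)
    return theta + delta.reshape(theta.shape), state
```

A single transition could be processed as `theta, state = meta_update(theta, learner, state, s, a, r, s_next)`, carrying `state` forward so that the LSTM can condition each update on the history of gradients.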