Abstract

Many policy-improving systems have been proposed for Reinforcement Learning (RL) agents that adapt quickly to environmental change by using statistical methods, such as a mixture model of Bayesian networks, or mixture probability and clustering distributions. However, a system based on a mixture model of Bayesian networks increases computational complexity, so controlling that complexity becomes a necessary problem. On the other hand, a system based on mixture probability and clustering distributions can control computational complexity while maintaining performance, but its computational load and its adaptation performance in more complex environments, such as 3D environments, still need to be examined. In this paper, we concentrate on the policy-improving system that uses mixture probability and clustering distributions. We introduce new parameters and a modified reward process for experiments in 3D environments, and then investigate and discuss the performance of the proposed system based on the results.
