Abstract

Reinforcement Learning (RL) has found widespread application in a variety of decision-making tasks. However, it still faces challenges such as the deadly triad, slow convergence, and reward drop, which limit its practical scope. This paper addresses two of these issues observed during RL training: slow convergence and reward drop. Our proposed solution is a reward shaping method composed of two distinct components, each serving a specific purpose: speeding up training and enhancing stability. The two components are coupled through hyper-parameters, and the choice of these hyper-parameters plays a critical role in the final performance of the RL algorithm. To optimize them effectively, we discretely sample the value range of each hyper-parameter, which yields a sparse set of data points within the reward matrix. We then introduce a fitting approach based on the Expectation-Maximization (EM) algorithm to estimate the global maximum of the reward matrix along with the corresponding hyper-parameter combination; this EM method significantly reduces computational complexity. Extensive experiments across various RL environments demonstrate the effectiveness of the proposed method: it mitigates reward drop while simultaneously accelerating the convergence of the RL algorithm.
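The abstract does not specify the exact form of the EM-based fitting, so the sketch below is only one plausible reading of the idea: treat the sparse (hyper-parameter, reward) samples as a reward-weighted point cloud, fit a small Gaussian mixture to it with EM, and take the mean of the heaviest component as the estimated location of the reward maximum. The function name `em_peak_estimate`, the two-component mixture, and the synthetic reward surface are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def em_peak_estimate(points, rewards, n_components=2, n_iter=50, seed=0):
    """Illustrative sketch (not the paper's method): fit a Gaussian mixture
    to reward-weighted hyper-parameter samples via EM and return the mean of
    the heaviest component as an estimate of where the reward matrix peaks."""
    rng = np.random.default_rng(seed)
    X = np.asarray(points, dtype=float)      # (n, d) sampled hyper-parameters
    w = np.asarray(rewards, dtype=float)
    w = w - w.min() + 1e-9                   # shift rewards to positive weights
    w = w / w.sum()                          # normalise to a probability mass
    n, d = X.shape
    means = X[rng.choice(n, n_components, replace=False)].copy()
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(n_components)])
    pis = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k * N(x_i | mu_k, S_k)
        r = np.empty((n, n_components))
        for k in range(n_components):
            diff = X - means[k]
            inv = np.linalg.inv(covs[k])
            det = np.linalg.det(covs[k])
            maha = np.einsum('ij,jk,ik->i', diff, inv, diff)
            r[:, k] = pis[k] * np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** d * det)
        r /= r.sum(axis=1, keepdims=True) + 1e-300   # guard against underflow
        # M-step: standard GMM updates, each sample weighted by its reward mass
        wr = r * w[:, None]
        Nk = wr.sum(axis=0) + 1e-12
        pis = Nk / Nk.sum()
        for k in range(n_components):
            means[k] = (wr[:, k:k + 1] * X).sum(axis=0) / Nk[k]
            diff = X - means[k]
            covs[k] = (wr[:, k] * diff.T) @ diff / Nk[k] + 1e-6 * np.eye(d)
    return means[np.argmax(pis)]

# Usage on a synthetic reward surface peaked at (0.7, 0.3),
# sampled on a sparse 8x8 grid of two hyper-parameters:
a = np.linspace(0.0, 1.0, 8)
A, B = np.meshgrid(a, a)
pts = np.column_stack([A.ravel(), B.ravel()])
rew = np.exp(-((pts[:, 0] - 0.7) ** 2 + (pts[:, 1] - 0.3) ** 2) / 0.02)
est = em_peak_estimate(pts, rew)
```

Because the estimate is a weighted centroid rather than a grid argmax, it can land between grid points, which is the advantage the abstract alludes to: the sparse sampling keeps the number of RL training runs small, while the EM fit interpolates a maximum that the grid itself never evaluated.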
