Abstract

Purpose
Many practical control problems require achieving multiple objectives, and these objectives often conflict with one another. Existing multi-objective evolutionary reinforcement learning algorithms achieve poor search results on such problems, so a new multi-objective evolutionary reinforcement learning algorithm with stronger search capability is needed.

Design/methodology/approach
The multi-objective reinforcement learning algorithm proposed in this paper is built on an evolutionary computation framework. In each generation, a long-short-term selection method selects the parent policies. Long-term selection is based on each policy's improvement along its predefined optimization direction in the previous generation, while short-term selection uses a prediction model to identify the optimization direction likely to yield the greatest improvement in overall population performance. In the evolutionary stage, a penalty-based nonlinear scalarization method scalarizes the multi-dimensional advantage functions, and a nonlinear multi-objective policy gradient is designed to optimize the parent policies along the predefined directions.

Findings
The penalty-based nonlinear scalarization method forces policies to improve along the predefined optimization directions. The long-short-term selection method alleviates the exploration-exploitation problem, enabling the algorithm to explore unknown regions while ensuring that promising policies are fully optimized. Combining these designs effectively improves the performance of the final population.

Originality/value
A multi-objective evolutionary reinforcement learning algorithm with stronger search capability is proposed. The algorithm can find a Pareto policy set with better convergence, diversity and density.
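As a rough illustration of the scalarization idea described above, the Python sketch below shows one plausible penalty-based nonlinear form, modeled on the penalty boundary intersection (PBI) scalarization commonly used in decomposition-based multi-objective optimization. The function name penalty_scalarize, the penalty weight theta and the PBI-style formula are assumptions for illustration only; the abstract does not specify the paper's exact formulation.

import numpy as np

def penalty_scalarize(advantages, direction, theta=5.0):
    """Hypothetical penalty-based scalarization of a multi-dimensional
    advantage vector along a predefined optimization direction.

    advantages : np.ndarray, shape (m,) -- per-objective advantage estimates
    direction  : np.ndarray, shape (m,) -- predefined optimization direction
    theta      : float -- penalty weight discouraging deviation (assumed)
    """
    direction = direction / np.linalg.norm(direction)
    # d1: progress of the advantage vector along the predefined direction.
    d1 = float(np.dot(advantages, direction))
    # d2: perpendicular deviation from that direction; penalizing it pushes
    # the policy gradient to improve along the chosen direction.
    d2 = float(np.linalg.norm(advantages - d1 * direction))
    return d1 - theta * d2

# Example: a two-objective advantage vector scalarized along an
# equal-weight direction.
adv = np.array([0.8, 0.3])
scalar_adv = penalty_scalarize(adv, np.array([1.0, 1.0]))

Because the penalty term grows with the deviation from the predefined direction, maximizing this scalar quantity keeps a policy's improvement aligned with its assigned direction, which matches the behavior the abstract attributes to the penalty-based scalarization.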


