Abstract

In recent years, reinforcement learning problems have become increasingly complex, and their computational demands have grown accordingly. Various methods for effective learning have therefore been proposed. With human help, the learning object can learn more accurately and quickly to maximize the reward. However, the rewards calculated by the system and those obtained through human intervention, which together make up the learning environment, differ and must be handled accordingly. In this paper, we propose a framework for learning competitive network topology problems, in which the environment topology changes dynamically, by computing rewards both via the system and via human evaluation. The proposed method is adaptively updated with the rewards calculated via human evaluation, which makes learning more stable and reduces the penalty incurred during learning. It also ensures learning accuracy, including for rewards generated from complex network topologies consisting of multiple agents. The proposed framework contributes to a fast training process through multi-agent cooperation. We implement these methods in software and perform numerical analysis to demonstrate the effectiveness of the adaptive evaluation framework on the competitive network problem with dynamic environmental topology changes. In the numerical experiments, greater human intervention led to better learning performance with the proposed framework.

Highlights

  • Reinforcement learning is concerned with the problem of maximizing the rewards of learning objects that need to be effectively controlled within a defined environment

  • The proposed cooperative human–machine evaluation framework is effective for complex network topologies when human evaluation intervenes in reinforcement learning

  • In the search for algorithms that solve these problems and perform well, learning methods are studied that add accurate, quantitative evaluation through intervention by humans with expert knowledge and experience in the reinforcement learning process. This trend has led to the need for pre-processing or pre-learning to make learning objects resemble human behavior or appearance in reinforcement learning

Summary

Introduction

Reinforcement learning is concerned with the problem of maximizing the rewards of learning objects that need to be effectively controlled within a defined environment. The more complex the human’s behavior and the system configuration are, the more difficult the problem becomes and the longer the learning object takes to learn. These problems can be addressed through additional tasks such as pre-learning or preprocessing; however, these tasks are not very effective because they take a long time to complete and corrupt the learning data. To solve complex and difficult reinforcement learning problems effectively, methods that rely on intuitive and professional human intervention have been proposed. Existing studies on reinforcement learning with human help were concerned with single-model problems in a simple environment, with a static simple network and a single agent.
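The idea of learning from two reward sources, one computed by the system and one supplied by a human evaluator, can be illustrated with a minimal sketch. This is not the framework proposed in the paper: the convex blending rule, the tabular Q-learning update, and every name here (blended_reward, q_update, weight, alpha, gamma) are assumptions chosen only for illustration.

```python
from collections import defaultdict


def blended_reward(system_reward, human_reward, weight):
    """Mix the system reward with an optional human evaluation.

    Hypothetical rule: a convex combination controlled by `weight` in [0, 1].
    When no human feedback is available (None), only the system reward is used.
    """
    if human_reward is None:
        return system_reward
    return (1.0 - weight) * system_reward + weight * human_reward


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning step applied to the blended reward."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# Usage: one learning step in which a human evaluator scores the action lower
# than the system did, pulling the learned value down.
q = defaultdict(float)          # unseen state-action pairs start at 0.0
actions = ["left", "right"]
r = blended_reward(system_reward=1.0, human_reward=0.0, weight=0.25)
q_update(q, "s0", "left", r, "s1", actions)
```

In this sketch, raising weight gives the human evaluation more influence on each update; how the paper's framework actually adapts to human rewards is described in its method sections.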

Background and Literature Review
Method
Methodology of Cooperative Human–Robot Evaluation
System Implementation and Experimental Results
Players’
Implemented
Conclusions
