Manufacturing enterprises that cannot make large-scale resource adjustments find it very difficult to coordinate multiproject, multilevel plans automatically. In addition, planning and coordination rely mostly on human experience, so plans are often inaccurate. This article proposes the PERT-RP-DDPGAO algorithm, which combines the program evaluation and review technique (PERT) with the deep deterministic policy gradient (DDPG) method. Using matrix computation, the algorithm treats the resource plan (RP) itself, for the first time, as the reinforcement learning agent, and thereby coordinates multilevel plans automatically. Experiments show that the algorithm achieves automatic planning and that its behavior is interpretable in terms of management theory. To handle continuous control, the second half of the algorithm adopts DDPG, which converges and responds faster than traditional reinforcement learning algorithms and heuristic algorithms. Its response time is 3.0% shorter than that of the traditional deep Q-network (DQN) algorithm and more than 8.4% shorter than that of the heuristic algorithm.
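For readers unfamiliar with the PERT half of the method, the following is a minimal sketch of the standard PERT three-point estimate and forward pass. It is not the PERT-RP-DDPGAO algorithm itself. The names `Activity`, `earliest_finish`, and the sample activities A, B, and C are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Hypothetical container for one activity's three-point estimate."""
    optimistic: float
    most_likely: float
    pessimistic: float

    @property
    def expected(self) -> float:
        # Standard PERT expected duration: (o + 4m + p) / 6
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    @property
    def variance(self) -> float:
        # Standard PERT variance: ((p - o) / 6) ** 2
        return ((self.pessimistic - self.optimistic) / 6) ** 2


def earliest_finish(activities, predecessors):
    """Forward pass over an acyclic network: earliest finish per activity."""
    finish = {}

    def visit(name):
        if name not in finish:
            # An activity starts once all of its predecessors have finished.
            start = max((visit(p) for p in predecessors.get(name, ())), default=0.0)
            finish[name] = start + activities[name].expected
        return finish[name]

    for name in activities:
        visit(name)
    return finish


# Illustrative data only (assumed, not taken from the article).
activities = {
    "A": Activity(2, 4, 6),
    "B": Activity(1, 2, 9),
    "C": Activity(3, 5, 7),
}
predecessors = {"C": ["A", "B"]}  # C cannot start until A and B finish
plan = earliest_finish(activities, predecessors)
```

In this sketch the expected project duration is the largest value in `plan`. A learning agent such as the one the article describes would adjust resources so that this value changes, but how that is done is not specified here.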