Abstract

This paper presents a deep reinforcement learning (DRL)-based task scheduling algorithm for an FPGA-based real-time digital simulation (FRTDS) system, which generates arrangements that minimize the makespan of a task sequence under limited resources. The algorithm has two parts: synthetic cost construction and DRL processing to make arrangements. The synthetic cost captures the cost of each candidate arrangement in terms of both resource usage and the probability of blocking. This cost is used to estimate the state-action value function in a deep Q-network (DQN) procedure that generates an optimized scheduling strategy. We build the reinforcement learning strategy-generation process by instantiating the computing components in the hardware as agents, and the RAM resources and communication I/O ports as the environment. A hardware-design-based decision rule ensures that computing variables are distributed as evenly as possible across storage while making full use of the pipeline characteristics of the FPGA. A compiler is written to generate an FRTDS binary stream that drives the FRTDS. The accuracy and performance of the proposed method are verified and evaluated. We present simulation results for the proposed method and for a classic method; comparing them, the makespan obtained by the proposed method is significantly shorter, which translates into higher computing power and the ability to handle larger-scale real-time simulation.
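For a concrete picture of the loop the abstract describes, the following is a minimal sketch in Python. It stands in for the paper's DQN with a simple linear Q-approximation; the feature size, the hyperparameters, and the equal weighting inside `synthetic_cost` are illustrative assumptions, not values from the paper.

```python
import random
import numpy as np

# Hypothetical sketch of the DQN-style scheduling loop from the abstract:
# computing components act as agents, RAM banks and I/O ports form the
# environment, and the synthetic cost (resource usage plus blocking
# probability) serves as a negated reward. A linear Q-function stands in
# for the deep network; all names below are illustrative assumptions.

N_FEATURES = 8      # assumed size of the state-action feature vector
GAMMA = 0.9         # discount factor (assumed)
ALPHA = 0.01        # learning rate (assumed)
EPSILON = 0.1       # exploration rate (assumed)

w = np.zeros(N_FEATURES)  # weights of the linear Q-approximation

def q_value(features):
    """Approximate Q(s, a) from a joint state-action feature vector."""
    return float(w @ features)

def synthetic_cost(resource_usage, block_probability):
    """Combine resource usage and blocking probability into one cost.
    The equal weighting here is an assumption for illustration."""
    return resource_usage + block_probability

def select_task(candidate_features):
    """Epsilon-greedy selection over the ready tasks of one agent."""
    if random.random() < EPSILON:
        return random.randrange(len(candidate_features))
    return max(range(len(candidate_features)),
               key=lambda i: q_value(candidate_features[i]))

def td_update(features, cost, next_best_q):
    """One temporal-difference step: the reward is the negated cost."""
    global w
    target = -cost + GAMMA * next_best_q
    w += ALPHA * (target - q_value(features)) * features

# Example: one agent choosing between two ready tasks.
cands = [np.random.rand(N_FEATURES), np.random.rand(N_FEATURES)]
i = select_task(cands)
td_update(cands[i], synthetic_cost(0.3, 0.1), next_best_q=0.0)
```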

Highlights

  • Real-time simulation is of great significance to control system design, hardware equipment testing, and staff training

  • A low-cost real-time simulation system based on a digital signal processor (DSP) was built for educational purposes [1]; [2] emphasized the importance of analyzing microgrids with a real-time digital simulator (RTDS), and [3] built a co-simulation framework that can assess microgrids with hardware-in-the-loop testing approaches

  • We propose a reinforcement learning (RL)-based algorithm that takes resource usage as a parameter to describe the cost of task selection and uses the balanced storage of variables as its arrangement principle (a minimal sketch follows this list)
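As referenced in the last highlight, here is a minimal sketch of a balanced-storage rule: each new variable is placed in the least-occupied RAM bank so that variables stay evenly distributed. The bank model and function names are assumptions for illustration, not the paper's actual decision rule.

```python
# Illustrative sketch of a balanced-storage decision rule: assign each
# new variable to the RAM bank currently holding the fewest variables.
# The occupancy model is an assumption for demonstration only.

def place_variable(var_name, ram_occupancy):
    """Assign var_name to the RAM bank with the fewest stored variables.

    ram_occupancy: list mapping bank index -> current variable count.
    Returns the chosen bank index and updates the occupancy in place.
    """
    bank = min(range(len(ram_occupancy)), key=lambda b: ram_occupancy[b])
    ram_occupancy[bank] += 1
    return bank

# Example: distribute six variables over three banks.
occupancy = [0, 0, 0]
for v in ["v0", "v1", "v2", "v3", "v4", "v5"]:
    print(v, "-> bank", place_variable(v, occupancy))
# Each bank ends up holding two variables.
```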


Summary

INTRODUCTION

Real-time simulation is of great significance to control system design, hardware equipment testing, and staff training. We propose a reinforcement learning (RL)-based algorithm that takes resource usage as a parameter to describe the cost of task selection and uses the balanced storage of variables as its arrangement principle. According to the preceding analysis, blocks emerge when conflicts exist in read/write addresses or in communication, because full-state feedback on resources at a given future hardware clock cannot be obtained explicitly. Even if a variable is present in a RAM storage area before the specified time, that RAM may not belong to the private RAM of the current computing component. In this case, the current task selection requires communication to arrange subsequent tasks, which incurs an additional blocking cost. We add a maximum-epochs limit to stop the iteration in the case of exceptions.
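The following sketch illustrates the two mechanisms just described: charging an extra communication cost when a variable is resident only outside the current component's private RAM, and capping the outer iteration with a maximum-epochs limit. The constant `COMM_COST`, the data structures, and all function names are hypothetical assumptions, not the paper's implementation.

```python
# Hedged sketch of the blocking check described above: a variable may be
# resident in some RAM bank before the scheduled time, yet that bank may
# not be the private RAM of the selecting computing component, in which
# case a communication transfer (and its blocking cost) is charged.

MAX_EPOCHS = 1000   # assumed upper bound to stop iteration on exceptions
COMM_COST = 5.0     # assumed extra cost of an inter-component transfer

def blocking_cost(variable, component, resident_banks, private_banks):
    """Return the extra cost incurred if `component` must fetch
    `variable` from a bank outside its private RAM.

    resident_banks: dict mapping variable -> set of banks holding it.
    private_banks: dict mapping component -> set of its private banks.
    """
    banks_holding_var = resident_banks.get(variable, set())
    if banks_holding_var & private_banks[component]:
        return 0.0                    # locally readable, no block
    if banks_holding_var:
        return COMM_COST              # present, but a transfer is needed
    raise KeyError(f"{variable} not resident in any bank")

def iterate_schedule(step):
    """Run the scheduling iteration with a hard epoch limit."""
    for epoch in range(MAX_EPOCHS):
        if step(epoch):               # step() returns True when converged
            return epoch
    return MAX_EPOCHS                 # stopped by the limit

# Example: converge after a fixed number of dummy steps.
print(iterate_schedule(lambda epoch: epoch >= 10))  # prints 10
```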

REINFORCEMENT LEARNING BASED ON COST
DECISION RULES AND DECISION ORDER OF COMPUTING COMPONENTS
COMPUTING TASK OPTIMIZED SCHEDULING ALGORITHM BASED ON FRTDS
FRTDS EXAMPLE VERIFICATION AND RESULTS ANALYSIS
CONCLUSION
