Orbit correction based on improved reinforcement learning algorithm

Xiaolong Chen, Yongzhi Jia, Yuan He, Xin Qi, Zhijun Wang

doi:10.1103/physrevaccelbeams.26.044601

Abstract

Reinforcement learning (RL) algorithms have recently been applied to a wide range of control problems in accelerator commissioning. To achieve fast control, these algorithms need to be highly efficient so that online training time is minimized. In this paper, we incorporate the beam position monitor (BPM) trend into the observation space of the twin delayed deep deterministic policy gradient (TD3) algorithm and train two agents with different structures: one based on physical prior knowledge and the other using the original TD3 network architecture. Both agents exhibit strong robustness in the simulated environment. The effectiveness of the agent based on physical prior knowledge has been validated on a real accelerator. The results show that the agent can overcome the differences between the simulated and real accelerator environments: once training is completed in the simulated environment, the agent can be applied directly to the real accelerator without any online training. The RL agent is deployed to the medium energy beam transport section of the China Accelerator Facility for Superheavy Elements, where fast, automatic orbit correction is being tested with up to ten degrees of freedom. The experimental results show that the agents can correct the orbit to within 1 mm. Moreover, owing to the strong robustness of the agent, a trained agent can still complete the orbit correction when it is applied to different lattices for different particle species. Because no online data collection or training is required, all online corrections are completed within 30 s. This paper shows that, as long as the RL algorithm is sufficiently robust, agents trained offline can be applied directly to online correction, which greatly improves the efficiency of orbit correction. Such an approach to RL may find promising applications in other areas of accelerator commissioning.
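
To make the idea of adding the beam position monitor (BPM) trend to the observation space more concrete, the following is a minimal Python sketch. It is not the authors' implementation. The names `build_observation` and `td3_target` are hypothetical, and the definition of "trend" as the difference between consecutive BPM readings is an assumption, since the abstract does not define it. The second function shows the standard clipped double-Q target that TD3 uses, with an assumed discount factor.

```python
from typing import List, Optional, Sequence


def build_observation(
    bpm_now: Sequence[float],
    bpm_prev: Optional[Sequence[float]] = None,
) -> List[float]:
    """Concatenate current BPM readings with their step-to-step change.

    Assumption: "trend" is taken here as the difference between the current
    and previous readings; the paper may define it differently.
    """
    if bpm_prev is None:
        # No history is available on the first step, so the trend is zero.
        trend = [0.0] * len(bpm_now)
    else:
        if len(bpm_prev) != len(bpm_now):
            raise ValueError("BPM vectors must have the same length")
        trend = [now - prev for now, prev in zip(bpm_now, bpm_prev)]
    return list(bpm_now) + trend


def td3_target(
    reward: float,
    q1_next: float,
    q2_next: float,
    done: bool,
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> float:
    """Clipped double-Q target of TD3: r + gamma * min(Q1', Q2')."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * min(q1_next, q2_next)


if __name__ == "__main__":
    # Two hypothetical BPM readings (in mm) at consecutive steps.
    print(build_observation([1.5, -0.5], [2.0, -1.0]))
    print(td3_target(reward=1.0, q1_next=2.0, q2_next=3.0, done=False, gamma=0.5))
```

With this layout the observation vector is twice the number of BPMs, so the agent sees both where the orbit is and how it responded to the last corrector action.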

Journal: Physical Review Accelerators and Beams
Publication Date: Apr 13, 2023
Citations: 1
License type: CC BY 4.0

Similar Papers

UAV maneuvering decision-making algorithm based on Twin Delayed Deep Deterministic Policy Gradient Algorithm
Shuangxia Bai ... Evgeny Neretin
Journal of Artificial Intelligence and Technology | 07 Dec 2021

A State-Compensated Deep Deterministic Policy Gradient Algorithm for UAV Trajectory Tracking
Jiying Wu ... Luwei Liao
Machines | VOL. 10 | 21 Jun 2022

Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient
Mireya Cabezas-Olivenza ... Ekaitz Zulueta
Mathematics | VOL. 11 | 27 Dec 2022

A DDPG Algorithm Based Reinforcement Learning Controller for Three-Phase DC-AC Inverters
Jian Ye ... Xinan Zhang
24 Feb 2023
