Abstract

As the shield tunneling method has become widely used in tunnel engineering, untimely or incorrect attitude control of shield systems has become a key factor affecting the quality of shield tunneling. Multi-agent deep reinforcement learning (MADRL) can learn correct and timely shield attitude correction policies, which makes it well suited to shield attitude control tasks that are complex and separable. However, the sensing data of shield systems are complicated and widely distributed, which reduces the learning efficiency of the agents. To resolve this problem, this paper proposes a MADRL framework based on state classification and assignment (SCA) for adaptive shield attitude control. SCA-MADRL uses a clustering algorithm to classify states and a planning model to give each agent a consistent state space as its learning space, so that each agent can learn efficiently within its own learning space. The state classification algorithm in the framework improves the learning efficiency of the agents, and the proposed assignment algorithm further enhances the learning effect when agents are reused. This paper verifies the necessity and effectiveness of using task decomposition to implement MADRL for equipment control and provides a practical framework for automated and intelligent shield systems.
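
The state classification idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, the two-dimensional "sensor states" are synthetic, and the names `kmeans` and `assign_agent` are hypothetical. The planning-model-based assignment and the reinforcement learning agents themselves are not reproduced; the sketch only shows how clustered states could be routed so that each agent sees one consistent region of the state space.

```python
import numpy as np


def kmeans(states, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering).

    Returns (centroids, labels) for an (n, d) array of states.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct randomly chosen states.
    centroids = states[rng.choice(len(states), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Distance from every state to every centroid, shape (n, k).
        dists = np.linalg.norm(states[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = states[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute labels against the final centroids.
    dists = np.linalg.norm(states[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def assign_agent(state, centroids):
    """Route a state to the agent whose cluster centroid is nearest."""
    state = np.asarray(state, dtype=float)
    return int(np.linalg.norm(centroids - state, axis=1).argmin())


# Synthetic data: two well-separated groups of 2-D states (illustrative only).
rng = np.random.default_rng(1)
states = np.vstack([
    rng.normal(loc=[-5.0, 0.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.3, size=(50, 2)),
])
centroids, labels = kmeans(states, k=2)

# Each incoming state is handled by the agent responsible for its cluster.
left_agent = assign_agent([-5.0, 0.0], centroids)
right_agent = assign_agent([5.0, 0.0], centroids)
```

In this sketch each cluster index doubles as an agent index, so a state near one group is always routed to the same agent. The framework described above additionally uses a planning model for assignment, which this nearest-centroid rule does not capture.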
