Abstract

With the advent of mass customisation, solving the assembly sequence planning (ASP) problem not only involves a non-convex optimisation problem that is hard to solve but also requires a high-speed response to changes in assembly resources. This paper proposes a deep reinforcement learning (DRL) approach to the ASP problem, aiming to improve response speed by exploiting the reusability and expandability of past decision-making experiences. First, the connector-based ASP problem is described in matrix form, and its objective function is set to minimise assembly cost under precedence constraints. Secondly, an instance generation algorithm is developed for policy training, and a mask algorithm is adopted to screen out infeasible assembly operations at each decision-making step. Then, the Monte Carlo sampling method is used to evaluate the ASP policy. The policy is learned by an actor–critic-based DRL algorithm, which contains two networks: a policy network and an evaluation network. Next, the network structures are introduced, and both networks are trained with a mini-batch algorithm. Finally, four cases are studied to validate the method, and the results are discussed. It is demonstrated that the proposed method can solve the ASP problem accurately and efficiently in an environment with dynamic resource changes.
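To make the masking step concrete, the sketch below shows one common way to apply a feasibility mask inside a policy network: logits of operations that violate the precedence constraints are set to negative infinity before the softmax, so they receive zero probability. This is a minimal illustration in PyTorch, not the paper's implementation; the class name, layer sizes, and parameter names (MaskedPolicyNetwork, state_dim, num_ops) are assumptions for the example.

```python
import torch
import torch.nn as nn

class MaskedPolicyNetwork(nn.Module):
    """Hypothetical policy network: maps an assembly state to a
    distribution over assembly operations, zeroing out infeasible ones."""

    def __init__(self, state_dim: int, num_ops: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_ops),
        )

    def forward(self, state: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        logits = self.body(state)
        # Operations with mask == 0 violate precedence constraints;
        # -inf logits give them exactly zero probability after softmax.
        logits = logits.masked_fill(mask == 0, float("-inf"))
        return torch.softmax(logits, dim=-1)

# Usage: sample the next assembly operation from the masked distribution.
net = MaskedPolicyNetwork(state_dim=16, num_ops=8)
state = torch.randn(1, 16)
mask = torch.tensor([[1, 1, 0, 1, 0, 0, 1, 1]])  # 1 = feasible this step
probs = net(state, mask)
action = torch.multinomial(probs, num_samples=1)
```

Masking at the logit level keeps the policy gradient well defined for feasible actions while guaranteeing that the sampled sequence never violates a precedence constraint, which matches the screening role the abstract assigns to the mask algorithm.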
