Abstract

With the advent of mass customisation, solving the assembly sequence planning (ASP) problem not only involves a non-convex optimisation problem that is hard to solve but also requires a fast response to changes in assembly resources. This paper proposes a deep reinforcement learning (DRL) approach to the ASP problem, aiming to improve response speed by exploiting the reusability and expandability of past decision-making experiences. First, the connector-based ASP problem is formulated in matrix form, and its objective function is set to minimise assembly cost under precedence constraints. Second, an instance generation algorithm is developed for policy training, and a mask algorithm is adopted to screen out infeasible assembly operations at each decision-making step. Then, the Monte Carlo sampling method is used to evaluate the ASP policy. The policy is learned with an actor–critic-based DRL algorithm, which comprises two networks: a policy network and an evaluation network. Next, the network structures are presented, and both networks are trained with a mini-batch algorithm. Finally, four cases are studied to validate the method, and the results are discussed. The results demonstrate that the proposed method can solve the ASP problem accurately and efficiently in an environment with dynamic resource changes.
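The abstract does not specify the implementation, but the masking step it describes is a standard pattern in DRL: infeasible actions are assigned a probability of zero before sampling. Below is a minimal sketch under assumed details, using PyTorch; the class name `PolicyNet`, the flattened connector-matrix state, and the boolean mask layout are illustrative assumptions, not the paper's actual architecture.

```python
# Hedged sketch: masking infeasible assembly operations in an actor-critic
# policy. All names and shapes here are hypothetical.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Maps a matrix-form assembly state to logits over candidate operations."""
    def __init__(self, n_parts: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_parts * n_parts, hidden),  # flattened connector matrix
            nn.ReLU(),
            nn.Linear(hidden, n_parts),            # one logit per candidate operation
        )

    def forward(self, state: torch.Tensor, mask: torch.Tensor):
        logits = self.net(state.flatten(start_dim=1))
        # Screen out infeasible operations: -inf logits receive exactly
        # zero probability after the softmax inside Categorical.
        logits = logits.masked_fill(~mask, float("-inf"))
        return torch.distributions.Categorical(logits=logits)

# Example: 5 parts; operations 1 and 3 violate precedence constraints this step.
state = torch.rand(1, 5, 5)
mask = torch.tensor([[True, False, True, False, True]])
dist = PolicyNet(n_parts=5)(state, mask)
action = dist.sample()  # only feasible operations can be drawn
```

In an actor–critic setup of the kind the abstract describes, a separate evaluation (critic) network would estimate the value of the same state, and `dist.log_prob(action)` would feed the policy-gradient update; those details are omitted here.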
