Abstract

A multi-agent iterative optimisation method based on deep reinforcement learning is proposed for the balancing and sequencing problem in mixed-model assembly lines. Building on a Markov decision process model of balancing and sequencing, three components are designed to achieve global optimisation of the line: a balancing agent that uses a deep deterministic policy gradient algorithm, a sequencing agent that uses an Actor–Critic algorithm, and an iterative interaction mechanism between the two agents' output solutions. During the iterative interaction, the agents exchange solution information, including assembly time and station workload. This exchange coordinates the worker assignment policy at the balancing stage with the production arrangement policy at the sequencing stage so as to minimise work overload and idle time at stations. Comparative experiments against heuristic rules, genetic algorithms, and the original deep reinforcement learning algorithm demonstrate the effectiveness of the proposed method on both small-scale and large-scale instances.
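
The alternating structure described above can be illustrated with a minimal sketch. It is not the proposed method: a greedy worker-assignment step and a brute-force sequencing step stand in for the deep deterministic policy gradient and Actor–Critic agents. The closed-station model (a station length in time units, with the worker walking back by one cycle time between units), the assumption that assembly time scales as 1/workers, and all function and parameter names are illustrative assumptions, not details taken from the paper.

```python
from itertools import permutations


def evaluate(sequence, times, cycle_time, station_length):
    """Return (work overload, idle time) summed over all stations.

    Assumed closed-station model: a unit that cannot be finished inside the
    station causes overload; a worker who returns before the next unit
    arrives accumulates idle time.
    """
    overload = idle = 0.0
    n_stations = len(next(iter(times.values())))
    for s in range(n_stations):
        pos = 0.0  # worker's start position, in time units
        for model in sequence:
            end = pos + times[model][s]
            if end > station_length:
                overload += end - station_length
                end = station_length
            pos = end - cycle_time
            if pos < 0:
                idle += -pos
                pos = 0.0
    return overload, idle


def cost(sequence, base_times, workers, cycle_time, station_length):
    # Assumption: assembly time at a station shrinks as 1 / (number of workers).
    times = {m: [t / w for t, w in zip(ts, workers)]
             for m, ts in base_times.items()}
    return sum(evaluate(sequence, times, cycle_time, station_length))


def balance_step(sequence, base_times, total_workers, cycle_time, station_length):
    """Greedy stand-in for the balancing agent: place extra workers one by one."""
    n = len(next(iter(base_times.values())))
    workers = [1] * n
    for _ in range(total_workers - n):
        def trial(s):
            w = workers[:s] + [workers[s] + 1] + workers[s + 1:]
            return cost(sequence, base_times, w, cycle_time, station_length)
        workers[min(range(n), key=trial)] += 1
    return workers


def sequence_step(sequence, base_times, workers, cycle_time, station_length):
    """Brute-force stand-in for the sequencing agent (small instances only)."""
    candidates = sorted(set(permutations(sequence)))
    return min(candidates, key=lambda seq: cost(
        seq, base_times, workers, cycle_time, station_length))


def iterate(sequence, base_times, total_workers, cycle_time, station_length,
            max_rounds=10):
    """Alternate the two steps, passing each step's solution to the other."""
    args = (cycle_time, station_length)
    workers = balance_step(sequence, base_times, total_workers, *args)
    best = cost(sequence, base_times, workers, *args)
    for _ in range(max_rounds):
        cand_seq = sequence_step(sequence, base_times, workers, *args)
        cand_workers = balance_step(cand_seq, base_times, total_workers, *args)
        c = cost(cand_seq, base_times, cand_workers, *args)
        if c >= best:  # stop when a full round brings no improvement
            break
        sequence, workers, best = cand_seq, cand_workers, c
    return list(sequence), workers, best
```

Here each step sees the other's latest solution through the shared cost function, which plays the role of the exchanged assembly-time and workload information. A learned policy would replace the exhaustive search in `sequence_step`, which is only feasible for very short sequences.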
