Abstract

This paper introduces an approach that applies reinforcement learning (RL) to production scheduling in a two-stage hybrid flow shop (THFS) production system. The Advantage Actor-Critic (A2C) method is used to train multiple agents to minimize the total tardiness and makespan of a production program. The two-stage hybrid flow shop scheduling problem is an NP-hard combinatorial optimization problem describing a production system with two stages, each consisting of a set of parallel machines. Our concept combines a discrete-event simulation with a pre-implemented RL algorithm from Stable Baselines3. Since similar research often lacks concrete implementation details, the configuration of the OpenAI Gym interface and the agent-environment interaction are presented.
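The agent-environment interaction described above can be sketched as a Gym-style environment with `reset` and `step` methods. The job data, machine counts, dispatching logic, and reward shaping below are illustrative assumptions, not the paper's actual configuration; external dependencies such as Stable Baselines3 are omitted so the sketch stays self-contained:

```python
class THFSEnv:
    """Gym-style sketch of a two-stage hybrid flow shop (hypothetical data).

    Observation: machine-ready times at both stages plus the remaining jobs.
    Action: index of the next job to dispatch; each stage assigns it to the
    earliest-available parallel machine. Reward: negative tardiness increment,
    so maximizing return minimizes total tardiness.
    """

    def __init__(self, proc_times, due_dates, machines=(2, 2)):
        self.proc = proc_times    # (p_stage1, p_stage2) per job, assumed data
        self.due = due_dates      # due date per job, assumed data
        self.machines = machines  # parallel machines per stage
        self.reset()

    def reset(self):
        self.ready1 = [0.0] * self.machines[0]
        self.ready2 = [0.0] * self.machines[1]
        self.remaining = set(range(len(self.proc)))
        self.makespan = 0.0
        self.total_tardiness = 0.0
        return self._obs()

    def _obs(self):
        return (tuple(self.ready1), tuple(self.ready2),
                tuple(sorted(self.remaining)))

    def step(self, action):
        assert action in self.remaining, "job already scheduled"
        p1, p2 = self.proc[action]
        # Stage 1: earliest-available machine processes the job.
        m1 = min(range(len(self.ready1)), key=self.ready1.__getitem__)
        end1 = self.ready1[m1] + p1
        self.ready1[m1] = end1
        # Stage 2: job may wait for stage 1 to finish or for a free machine.
        m2 = min(range(len(self.ready2)), key=self.ready2.__getitem__)
        end2 = max(end1, self.ready2[m2]) + p2
        self.ready2[m2] = end2
        self.remaining.discard(action)
        self.makespan = max(self.makespan, end2)
        tardiness = max(0.0, end2 - self.due[action])
        self.total_tardiness += tardiness
        done = not self.remaining
        return self._obs(), -tardiness, done, {}


# Roll out an earliest-due-date (EDD) baseline policy on three example jobs.
env = THFSEnv(proc_times=[(3, 2), (2, 4), (4, 1)], due_dates=[6, 7, 9])
obs, done = env.reset(), False
while not done:
    job = min(env.remaining, key=lambda j: env.due[j])  # EDD heuristic
    obs, reward, done, _ = env.step(job)
```

In the paper's setup this environment would instead be wrapped with the OpenAI Gym interface and driven by Stable Baselines3's A2C implementation, with the discrete-event simulation replacing the closed-form completion-time bookkeeping shown here.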

