Abstract

This paper presents a methodology that uses intelligent agents to find a dynamic control policy for the problem of scheduling a single server across multiple products. The dynamic (state-dependent) policy optimizes a cost function based on WIP inventory, backorder penalty, and setup costs, while meeting the productivity constraints for the products. The methodology uses a simulation-optimization technique called Reinforcement Learning (RL) and was tested on a stochastic economic lot-scheduling problem (SELSP) with a state–action space of size 1.8 × 10⁷. The dynamic policies obtained through the RL-based approach outperformed various cyclic policies. The RL approach was implemented via a multi-agent control architecture in which a decision agent was assigned to each product. A neural-network-based approach (the least mean square (LMS) algorithm) was used to approximate the reinforcement value function during the implementation of the RL-based methodology. Finally, the dynamic control policy over the large state space was extracted from the reinforcement values using a commercially available tree-classifier tool.
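The abstract does not give implementation details, so the following is only a minimal, self-contained sketch of the core idea it describes: reinforcement learning with an LMS-style (delta-rule) linear value-function approximation for a single-server, multi-product scheduling problem. All cost parameters, demand probabilities, the feature choices, and the one-step simulator below are illustrative assumptions, not values or design decisions from the paper, and the multi-agent architecture is omitted.

```python
import random

N = 2                                 # number of products (assumed for the sketch)
CAP = 20                              # inventory/backlog bound used by this sketch
HOLD = [1.0, 1.5]                     # WIP holding cost per unit per period (assumed)
BACK = [5.0, 8.0]                     # backorder penalty per unit per period (assumed)
SETUP = 20.0                          # cost of switching the server between products
DEMAND_P = [0.30, 0.25]               # Bernoulli demand probability per period
PROD_P = 0.9                          # probability a production attempt succeeds
ALPHA, GAMMA, EPS = 0.001, 0.95, 0.1  # LMS step size, discount, exploration rate

def phi(inv, setup, a):
    """Feature vector for (state, action): bias, scaled inventories, switch flag."""
    return [1.0, inv[0] / CAP, inv[1] / CAP, 1.0 if a != setup else 0.0]

def q(w, inv, setup, a):
    """Linear approximation of the action value."""
    return sum(wi * xi for wi, xi in zip(w[a], phi(inv, setup, a)))

def step(inv, setup, a):
    """One simulated period: produce product a, draw demands, return -cost as reward."""
    cost = SETUP if a != setup else 0.0
    inv = list(inv)
    if random.random() < PROD_P:           # production attempt on product a
        inv[a] += 1
    for i in range(N):
        if random.random() < DEMAND_P[i]:  # stochastic demand arrival
            inv[i] -= 1                    # negative inventory = backorders
        inv[i] = max(-CAP, min(CAP, inv[i]))
        cost += HOLD[i] * max(inv[i], 0) + BACK[i] * max(-inv[i], 0)
    return inv, a, -cost

w = [[0.0] * 4 for _ in range(N)]          # one weight vector per action
inv, setup = [0] * N, 0
for t in range(200_000):
    # epsilon-greedy choice of which product the server works on next
    a = (random.randrange(N) if random.random() < EPS
         else max(range(N), key=lambda x: q(w, inv, setup, x)))
    nxt, nsetup, r = step(inv, setup, a)
    target = r + GAMMA * max(q(w, nxt, nsetup, x) for x in range(N))
    err = target - q(w, inv, setup, a)     # temporal-difference error
    for j, xj in enumerate(phi(inv, setup, a)):
        w[a][j] += ALPHA * err * xj        # LMS (delta-rule) weight update
    inv, setup = nxt, nsetup

print("learned weights per action:", w)
```

In the spirit of the paper's final step, the greedy policy implied by such learned values could then be compiled into an explicit state-dependent rule, e.g. by feeding (state, greedy action) pairs to a tree classifier.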
