Abstract
Most solutions to the inventory management problem assume a centralization of information that is incompatible with organizational constraints in supply chain networks. The problem can be naturally decomposed into sub-problems, each associated with an independent entity, turning it into a multi-agent system. A decentralized solution to inventory management using multi-agent reinforcement learning (MARL) is proposed, where each entity is controlled by an agent. Three multi-agent variations of the proximal policy optimization algorithm are investigated through simulations of different supply chain networks and levels of uncertainty. A framework is adopted that relies on offline centralization during simulation-based policy identification but enables decentralization when the policies are deployed online to the real system. Results show that relaxing information-sharing constraints during training enables MARL to perform comparably to a centralized learning-based solution when deployed, and to outperform a distributed model-based solution in most cases, whilst respecting the information constraints of the system.
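The centralized-training, decentralized-execution scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the entity names, the one-parameter ordering policy, and the shortfall-based critic are all hypothetical stand-ins for the PPO agents and supply chain simulator used in the work.

```python
class EntityAgent:
    """One supply-chain entity (hypothetical); acts on its local observation only."""
    def __init__(self, name):
        self.name = name
        self.weight = 0.5  # placeholder for a learned policy parameter

    def act(self, local_obs):
        # Order quantity proportional to the local inventory shortfall.
        return max(0.0, self.weight * local_obs["shortfall"])


class CentralizedCritic:
    """Used only offline, during simulation-based training: sees all observations."""
    def value(self, joint_obs):
        # Toy value estimate: penalize total shortfall across the network.
        return -sum(o["shortfall"] for o in joint_obs.values())


# Offline (training in simulation): the critic centralizes information.
agents = {n: EntityAgent(n) for n in ["supplier", "warehouse", "retailer"]}
critic = CentralizedCritic()
joint_obs = {"supplier": {"shortfall": 2.0},
             "warehouse": {"shortfall": 1.0},
             "retailer": {"shortfall": 4.0}}
training_signal = critic.value(joint_obs)  # -7.0

# Online (deployment): each agent acts from its own observation alone,
# so no cross-entity information sharing is required at execution time.
orders = {n: a.act(joint_obs[n]) for n, a in agents.items()}
print(orders)
```

The key point of the split is that the `CentralizedCritic` is discarded after training; only the per-entity policies are deployed, which is what lets the system respect the organizational information constraints.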