Abstract

Medical equipment is a critical resource during the COVID-19 pandemic. An efficient and stable supply chain for medical equipment (masks, goggles, protective coveralls, etc.) enables medical workers and first responders to fight this highly contagious disease effectively and safely. In my research, I design and investigate two agents, one based on the traditional (s, Q) policy and one on a Deep Reinforcement Learning (DRL) algorithm, and apply each to optimize a two-echelon medical equipment supply chain consisting of one distribution center and multiple retailers. To my knowledge, this is the first application of a DRL algorithm to medical supply chain optimization. I implement the DRL algorithm in Python using the Ray and RLlib packages and conduct experiments on Google Colab with GPU support. To maximize the DRL algorithm's potential, I tune its reward function and hyperparameters. Testing the agents under different environment initializations, I find that the DRL agent outperforms the static (s, Q) agent, one of the most commonly used methods in inventory optimization systems, returning a 17.33% greater cumulative reward on average. Over ten repeated trials, the relative standard deviation of the baseline (s, Q) policy is 1.97% and that of the DRL agent is 2.49%. Thus, the DRL approach is not only stable but can also significantly improve the retailer's profit. My DRL model can be further applied to more complicated multi-echelon supply chain systems and lays a solid foundation for optimizing large-scale medical supply chains [TF18].
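For context on the baseline, the snippet below is a minimal sketch of a static (s, Q) reorder rule; it is not the author's code, and the parameter values are illustrative only.

```python
# Minimal sketch (not the author's implementation) of the static (s, Q)
# baseline: whenever a retailer's inventory position drops to or below the
# reorder point s, a fixed quantity Q is ordered from the distribution center.

def s_q_policy(inventory_position: float, s: float, Q: float) -> float:
    """Return the order quantity under a static (s, Q) policy."""
    return Q if inventory_position <= s else 0.0

# Example with illustrative numbers: three retailers, s = 20 units, Q = 50 units.
positions = [15, 35, 20]
orders = [s_q_policy(p, s=20, Q=50) for p in positions]
print(orders)  # [50, 0.0, 50] -> the first and third retailers reorder
```

The sketch below illustrates, under stated assumptions, how a custom supply-chain environment could be wired into RLlib and trained on Ray. The abstract names Ray and RLlib but not the specific DRL algorithm or environment design, so PPO, the toy environment dynamics, the reward terms, and all hyperparameters here are hypothetical placeholders rather than the author's tuned configuration.

```python
# Hypothetical sketch of registering and training a two-echelon supply-chain
# environment with RLlib (Ray 2.x API). Everything below (PPO, the toy
# environment, the reward, the hyperparameters) is assumed for illustration.
import gymnasium as gym
import numpy as np
import ray
from ray.tune.registry import register_env
from ray.rllib.algorithms.ppo import PPOConfig


class ToySupplyChainEnv(gym.Env):
    """Stand-in environment: observations are retailer inventory levels,
    actions are order quantities placed with the distribution center."""

    def __init__(self, config=None):
        config = config or {}
        self.num_retailers = config.get("num_retailers", 3)
        self.max_inventory, self.max_order, self.horizon = 100.0, 50.0, 30
        self.observation_space = gym.spaces.Box(
            0.0, self.max_inventory, shape=(self.num_retailers,), dtype=np.float32)
        self.action_space = gym.spaces.Box(
            0.0, self.max_order, shape=(self.num_retailers,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        self.inventory = np.full(self.num_retailers, 50.0, dtype=np.float32)
        return self.inventory.copy(), {}

    def step(self, action):
        on_hand = np.clip(self.inventory + action, 0.0, self.max_inventory)
        demand = self.np_random.poisson(10.0, size=self.num_retailers)
        sold = np.minimum(on_hand, demand)
        self.inventory = (on_hand - sold).astype(np.float32)
        # Illustrative reward: revenue on units sold minus a small holding cost.
        reward = float(sold.sum() - 0.1 * self.inventory.sum())
        self.t += 1
        return self.inventory.copy(), reward, False, self.t >= self.horizon, {}


ray.init(ignore_reinit_error=True)
register_env("two_echelon_supply_chain", lambda cfg: ToySupplyChainEnv(cfg))

config = (
    PPOConfig()
    .environment("two_echelon_supply_chain", env_config={"num_retailers": 3})
    .training(gamma=0.99, lr=1e-4)
)
algo = config.build()
result = algo.train()  # one training iteration
```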
