Abstract

We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which assume a continuous demand density, customer demand in our paper depends on the current period's price and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When there is no fixed ordering cost, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that the learned policy exhibits certain monotonicity properties. We also show that our deep reinforcement learning algorithm outperforms tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, and that it circumvents the “curse of dimensionality”.
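The system dynamics described above can be illustrated with a minimal single-period sketch for the lost-sales case. Everything named here is an assumption made for illustration and is not taken from the paper: the linear price response in `demand_rate`, FIFO issuing (oldest units sold first), the age-indexed `on_hand` list, and the convention that an order placed in a period arrives `len(pipeline) + 1` periods later.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def demand_rate(price, a=20.0, b=2.0):
    """Hypothetical linear price response; a and b are illustrative only."""
    return max(a - b * price, 0.0)


def step(on_hand, pipeline, order_qty, demand):
    """Advance one period under lost sales (assumed FIFO issuing).

    on_hand[i]  : units with i + 1 periods of life left.
    pipeline[j] : outstanding order arriving at the start of period t + j + 1.
    The order placed now joins the end of the pipeline, so the lead time
    is len(pipeline) + 1 periods.
    """
    stock = list(on_hand)
    remaining = demand
    # Serve demand from the oldest units first.
    for i in range(len(stock)):
        used = min(stock[i], remaining)
        stock[i] -= used
        remaining -= used
    sales = demand - remaining
    lost = remaining            # unmet demand is lost, not backlogged
    outdated = stock[0]         # unsold units in their last period expire
    # Shift the pipeline and age the surviving stock by one period.
    full_pipeline = list(pipeline) + [order_qty]
    arriving = full_pipeline[0]
    next_on_hand = stock[1:] + [arriving]
    next_pipeline = full_pipeline[1:]
    return next_on_hand, next_pipeline, sales, lost, outdated


if __name__ == "__main__":
    rng = random.Random(0)
    d = sample_poisson(demand_rate(price=6.0), rng)
    print(step([2, 3], [4], order_qty=5, demand=d))
```

A backlogging variant would carry `lost` forward as a negative inventory position instead of discarding it; that case is not sketched here.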

Highlights

  • Deep reinforcement learning outperforms Q-learning models that do not use neural networks. The outperformance of deep reinforcement learning has been shown by Ke et al. [6] and Shihab et al. [7] for complex problems

  • When there is no fixed ordering cost, we show that the dynamic pricing strategy, under which the price is adjusted according to inventory availability and the remaining lifetimes of items, dominates the fixed pricing strategy

  • To show the extensibility of the proposed algorithm, we extend the demand distribution to the additive form of Chen et al. [2], in which customer demand depends on the current period's price plus an additive random term; our proposed deep reinforcement learning models achieve near-optimal performance
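The two demand models mentioned in the highlights can be written side by side as follows. The notation is assumed for illustration and is not taken from the paper: $D_t$ is demand in period $t$, $p_t$ the price, $\lambda(\cdot)$ a price-dependent Poisson rate, $d(\cdot)$ a deterministic price response, and $\varepsilon_t$ a random term. The zero-mean condition on $\varepsilon_t$ is a common convention, not something stated in the source.

```latex
% Poisson case (main model): demand count in period t given price p_t
D_t \mid p_t \sim \mathrm{Poisson}\bigl(\lambda(p_t)\bigr)

% Additive case (extension): price response plus a random term
D_t = d(p_t) + \varepsilon_t, \qquad \mathbb{E}[\varepsilon_t] = 0 \ \text{(assumed)}
```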


Summary

Introduction

Deep reinforcement learning outperforms Q-learning models that do not use neural networks. The outperformance of deep reinforcement learning has been shown by Ke et al. [6] and Shihab et al. [7] for complex problems. We set up deep reinforcement learning models to study the joint pricing and inventory control problem of perishables. For the case with no fixed ordering cost, we set up a benchmark based on realized demand and show that our deep reinforcement learning methods perform better than tabular Q-learning. When the fixed ordering cost is taken into account in the joint pricing and inventory control system, we set up a performance upper bound based on the realized demand in each period in order to assess performance. Through our proposed methods, we find convergent policies and critical values under which orders should be placed.
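For readers unfamiliar with the tabular baseline, the following is a minimal sketch of the textbook Q-learning update, together with a rough count of how many table rows such a method would need. It is not the authors' implementation. The state encoding (one inventory counter per remaining-life class plus one per outstanding order), the cap `max_units`, and the step size and discount values are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.95):
    """Apply one textbook Q-learning update and return the new Q-value.

    q_table maps (state, action) pairs to values; unseen pairs default to 0.
    alpha (step size) and gamma (discount factor) are illustrative defaults.
    """
    best_next = max(q_table[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


def tabular_state_count(max_units, shelf_life, lead_time):
    """Count states under the assumed encoding.

    One counter in 0..max_units for each remaining-life class
    (shelf_life of them) and for each outstanding order (lead_time - 1).
    """
    return (max_units + 1) ** (shelf_life + lead_time - 1)


if __name__ == "__main__":
    q = defaultdict(float)
    # Hypothetical actions: (order quantity, price) pairs.
    actions = [(0, 5.0), (5, 5.0), (5, 6.0)]
    state, next_state = ((2, 3), (4,)), ((2, 4), (5,))
    print(q_update(q, state, actions[1], 10.0, next_state, actions))
    print(tabular_state_count(max_units=20, shelf_life=3, lead_time=2))
```

Under this encoding the table size grows exponentially in shelf life plus lead time, which is the dimensionality problem that a neural-network approximation of the value function is meant to avoid.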

