An Automated Deep Reinforcement Learning Pipeline for Dynamic Pricing

Reza Refaei Afshar, Uzay Kaymak, Jason Rhuggenaath, Yingqian Zhang

doi:10.1109/tai.2022.3186292

Open Access

Abstract

The dynamic pricing problem is difficult because the environment is highly dynamic and demand distributions are unknown. In this paper, we propose a Deep Reinforcement Learning (DRL) framework: a pipeline that automatically defines the DRL components for solving a dynamic pricing problem. An automated pipeline is necessary because a DRL framework can be designed in numerous ways, and manually finding an optimal configuration is tedious. This level of automation makes it possible for non-experts to use DRL for dynamic pricing. Our DRL pipeline covers three steps of DRL design: MDP modeling, algorithm selection, and hyper-parameter optimization. It starts by transforming the available information into a state representation and defining the reward function using a reward shaping approach. Then, the hyper-parameters are tuned using a novel hyper-parameter optimization method that integrates Bayesian Optimization with the selection operator of the Genetic Algorithm. As a case study, we apply our DRL pipeline to reserve price optimization problems in online advertising. We show that the DRL configuration obtained by our pipeline yields a pricing policy whose revenue is significantly higher than that of the benchmark methods. The evaluation is performed on a simulation of the real-time bidding (RTB) environment that we developed, which makes exploration possible for the RL agent.
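
To make the reserve price case study concrete, the following is a minimal sketch of a simulated auction that returns the revenue for a chosen reserve price. It is not the simulator described in the paper: the second-price auction format, the lognormal bid distribution, and the names `auction_revenue`, `average_revenue`, and `best_reserve` are all assumptions made for illustration. A plain grid search stands in for the RL agent.

```python
import random


def auction_revenue(bids, reserve):
    """Revenue of one auction, assuming a second-price format with a reserve."""
    if not bids:
        return 0.0
    ranked = sorted(bids, reverse=True)
    if ranked[0] < reserve:
        return 0.0  # no bid clears the reserve, so the impression is unsold
    second = ranked[1] if len(ranked) > 1 else 0.0
    # The winner pays the larger of the reserve and the second-highest bid.
    return max(reserve, second)


def average_revenue(reserve, n_auctions=10_000, n_bidders=3, seed=0):
    """Mean revenue over simulated auctions with hypothetical lognormal bids."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_auctions):
        bids = [rng.lognormvariate(0.0, 0.5) for _ in range(n_bidders)]
        total += auction_revenue(bids, reserve)
    return total / n_auctions


def best_reserve(candidates, **kwargs):
    """Grid search over candidate reserves (a stand-in for a learned policy)."""
    return max(candidates, key=lambda r: average_revenue(r, **kwargs))


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 1.5, 2.0]
    print("best reserve on grid:", best_reserve(grid))
```

The sketch shows the trade-off a pricing policy has to learn: a reserve that is too low leaves revenue to the second-highest bid, while one that is too high leaves the impression unsold.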
