Abstract

In electronic marketplaces, automated dynamic pricing is becoming increasingly popular. Agents that perform this task can improve over time by learning from past observations, for instance with reinforcement learning (RL) techniques. Several papers have studied Q-learning as a model for dynamic pricing in electronic marketplaces, but extending RL to large state spaces inevitably runs into the curse of dimensionality, so improving the agent's learning efficiency is essential for practical applications of RL. To address the problem of dynamic pricing, we take a Bayesian model-based approach: we frame the transition function and reward function of the MDP as distributions and use a sampling technique for action selection. The Bayesian approach naturally accounts for the general exploration vs. exploitation tradeoff. Simulations show that our dynamic pricing algorithm improves profits compared with other pricing strategies based on the same pricing model.
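To make the sampling-based action selection concrete, the following is a minimal sketch of a Thompson-sampling pricing loop, a bandit-style simplification of the Bayesian model-based approach the abstract describes. The price grid, the demand curve `true_buy_prob`, and the Beta priors over purchase probability are all illustrative assumptions, not details from the paper.

```python
import random

# Illustrative discrete price menu (an assumption, not from the paper).
PRICES = [5.0, 7.0, 9.0, 11.0]

# Beta(alpha, beta) posterior over the purchase probability at each
# price; [1.0, 1.0] is a uniform prior.
posterior = {p: [1.0, 1.0] for p in PRICES}

def true_buy_prob(price):
    # Hypothetical ground-truth demand: buyers balk at higher prices.
    return max(0.0, 1.0 - price / 12.0)

def choose_price():
    # Sample a plausible buy probability from each price's posterior
    # and pick the price maximizing sampled expected revenue. The
    # posterior sampling itself balances exploration vs. exploitation:
    # uncertain prices occasionally draw optimistic samples.
    best_price, best_rev = None, -1.0
    for p in PRICES:
        a, b = posterior[p]
        sampled_prob = random.betavariate(a, b)
        if sampled_prob * p > best_rev:
            best_price, best_rev = p, sampled_prob * p
    return best_price

def update(price, sold):
    # Conjugate Beta-Bernoulli update from the observed sale outcome.
    a, b = posterior[price]
    posterior[price] = [a + sold, b + (1 - sold)]

random.seed(0)
revenue = 0.0
for _ in range(5000):
    p = choose_price()
    sold = 1 if random.random() < true_buy_prob(p) else 0
    update(p, sold)
    revenue += p * sold
```

A full model-based treatment would additionally maintain distributions over the MDP's transition dynamics (e.g. inventory or competitor state), but the same sample-then-act pattern applies.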
