Abstract

We consider a stylized dynamic pricing model in which a monopolist prices a product to a sequence of T customers who independently make purchasing decisions based on the price offered according to a general parametric choice model. The parameters of the model are unknown to the seller, whose objective is to determine a pricing policy that minimizes the regret, which is the expected difference between the seller's revenue and the revenue of a clairvoyant seller who knows the values of the parameters in advance and always offers the revenue-maximizing price. We show that the regret of the optimal pricing policy in this model is [Formula: see text], by establishing an [Formula: see text] lower bound on the worst-case regret under an arbitrary policy, and presenting a pricing policy based on maximum-likelihood estimation whose regret is [Formula: see text] across all problem instances. Furthermore, we show that when the demand curves satisfy a “well-separated” condition, the T-period regret of the optimal policy is Θ(log T). Numerical experiments show that our policies perform well.
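To make the regret criterion concrete, the following minimal sketch computes the expected regret of a given price sequence against a clairvoyant seller. It assumes a logit purchase-probability model with hypothetical parameters `a` and `b`; this demand form, the parameter values, the grid search, and all function names are illustrative assumptions, not taken from the paper. It does not implement the maximum-likelihood pricing policy itself.

```python
import math


def purchase_prob(price, a, b):
    """Hypothetical logit demand: probability that a customer buys at `price`."""
    return 1.0 / (1.0 + math.exp(-(a - b * price)))


def expected_revenue(price, a, b):
    """Expected single-customer revenue at `price`."""
    return price * purchase_prob(price, a, b)


def clairvoyant_price(a, b, lo=0.0, hi=10.0, steps=10_000):
    """Grid search for the revenue-maximizing price when (a, b) are known."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: expected_revenue(p, a, b))


def expected_regret(prices, a, b):
    """Expected revenue lost, over the whole price sequence, relative to
    always offering the clairvoyant price."""
    best = expected_revenue(clairvoyant_price(a, b), a, b)
    return sum(best - expected_revenue(p, a, b) for p in prices)


if __name__ == "__main__":
    a, b = 2.0, 1.0  # assumed "true" parameters, unknown to the seller
    T = 100
    # A seller who never learns and always charges 1.0 accrues regret linear in T.
    print(expected_regret([1.0] * T, a, b))
```

A learning policy would replace the fixed price list with prices chosen from parameter estimates that improve as sales data accumulate, so that the per-period loss shrinks and total regret grows sublinearly in T.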

