We consider a periodical equilibrium pricing problem for multiple firms over a planning horizon of [Formula: see text] periods. In each period, firms set their selling prices and receive stochastic demand from consumers. Firms do not know their underlying demand curves, but they wish to set selling prices that maximize total revenue under competition. Hence, they must conduct price experiments so that the observed demand data are informative enough to guide pricing decisions. However, uncoordinated price updating can render the demand information gathered through price experimentation less informative or inaccurate. We design a nonparametric learning algorithm to facilitate coordinated dynamic pricing, in which competing firms estimate their demand functions from observations and adjust their pricing strategies in a prescribed manner. We show that the pricing decisions, determined by the estimated demand functions, converge to the underlying equilibrium as time progresses. We obtain a bound on the revenue difference of order [Formula: see text] and a regret bound of order [Formula: see text] with respect to the number of competing firms [Formula: see text] and [Formula: see text]. We also develop a modified algorithm to handle the situation in which some firms may know the demand curve.
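To make the idea of coordinated price experimentation concrete, the following is a minimal illustrative sketch and not the algorithm analyzed in the paper. It assumes two symmetric firms with a hypothetical linear demand curve (the parameters `a`, `b`, `c`, the noise level, and the step-size schedule are all assumptions chosen for illustration). In each round there is one baseline period in which all firms hold their prices, followed by one period per firm in which only that firm perturbs its price. Each firm can then attribute the demand change to its own price move, estimate its own-price slope by a finite difference, and take a stochastic gradient step on its revenue.

```python
import random


def equilibrium_price(a=10.0, b=2.0, c=1.0):
    """Symmetric two-firm Nash price for the assumed linear demand.

    Solves the first-order condition a - 2*b*p + c*p = 0.
    """
    return a / (2.0 * b - c)


def simulate(num_rounds=3000, delta=0.5, noise_sd=0.1, seed=0):
    """Coordinated finite-difference price learning (illustrative only)."""
    rng = random.Random(seed)
    a, b, c = 10.0, 2.0, 1.0  # hypothetical demand parameters, hidden from firms
    n = 2                     # number of firms (assumed)

    def demand(i, prices):
        # Noisy demand of firm i: a - b*p_i + c*(sum of rival prices) + noise.
        rival = sum(prices[j] for j in range(n) if j != i)
        return a - b * prices[i] + c * rival + rng.gauss(0.0, noise_sd)

    prices = [1.0] * n
    for t in range(num_rounds):
        # Baseline period: every firm charges its current price.
        base = [demand(i, prices) for i in range(n)]
        grads = []
        for i in range(n):
            # Firm i's experimentation period: only firm i deviates by delta,
            # so the observed demand change reflects its own price move.
            trial = list(prices)
            trial[i] += delta
            slope = (demand(i, trial) - base[i]) / delta
            # Revenue r_i = p_i * d_i, so dr_i/dp_i = d_i + p_i * (dd_i/dp_i).
            grads.append(base[i] + prices[i] * slope)
        step = 0.5 / (t + 5)  # diminishing step size (assumed schedule)
        prices = [max(0.0, p + step * g) for p, g in zip(prices, grads)]
    return prices


if __name__ == "__main__":
    print("learned prices:", simulate())
    print("equilibrium price:", equilibrium_price())
```

If the firms instead perturbed their prices simultaneously, the finite difference for firm `i` would mix its own-price effect with the rivals' cross-price effects, which is one way uncoordinated updating can make the collected demand data less informative.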