Abstract

We consider a service provider offering a subscription service to customers over a multi-period planning horizon. The customers decide whether to subscribe according to a utility model that represents their preferences for the service. The provider has a prior belief about the customer utility model, and updates its belief based on the transaction data of new customers and the usage data of existing subscribers. The provider aims to minimize its regret, namely the expected profit loss relative to a clairvoyant who knows the customer utility model. To analyze regret, we first study the clairvoyant's full-information problem. The resulting dynamic program, however, suffers from the curse of dimensionality. We develop a customer-centric approach to resolve this issue and obtain the optimal policy for the full-information problem. This approach balances the provider's immediate and future profits from an individual customer. When the provider does not have full information, we find that the simple and commonly used certainty-equivalence policy, which learns only passively, exhibits poor performance. We illustrate that this can be due to incomplete or slow learning, but can also occur because the provider offers a suboptimal contract with a long subscription period at the beginning. We propose a two-phase learning policy that focuses first on information accumulation and then on profit maximization. We show that our policy achieves asymptotically optimal performance, with its regret growing logarithmically in the planning horizon. Our results indicate that the provider should be cautious about offering a long subscription period when it is uncertain about customer preferences.
