Abstract

We consider dynamic assortment optimization with incomplete information under the uncapacitated multinomial logit choice model. We propose an anytime stochastic approximation policy and prove that the regret (the cumulative expected revenue loss caused by offering suboptimal assortments) after T time periods is bounded by √T times a constant that is independent of the number of products. In addition, we prove a matching lower bound on the regret for any policy that is valid for arbitrary model parameters, slightly generalizing a recent regret lower bound derived for specific revenue parameters. Numerical illustrations suggest that our policy outperforms alternatives by a significant margin when T and the number of products are not too small.
