Abstract
In online pricing with offline data, and in similar online learning problems that incorporate offline data, there are two classical approaches to measuring the performance of an online policy: the worst-case approach and the Bayesian approach. We argue that the worst-case approach may fail to appropriately account for offline data, particularly when evaluating how offline data affect online policy performance, and we construct counterexamples to support this argument. In contrast, we show that the Bayesian approach may be preferable because it appropriately measures the impact of offline data and has further desirable properties for finding and analyzing optimal policies.
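To make the two performance measures concrete, one standard formalization (a sketch only; the paper's exact definitions, and in particular how the offline data set enters each criterion, may differ) contrasts a supremum over model parameters with an expectation under a prior:

\[
R^{\mathrm{wc}}_T(\pi) \;=\; \sup_{\theta \in \Theta} \, \mathbb{E}_{\theta}\!\left[\mathrm{Regret}_T(\pi;\theta)\right],
\qquad
R^{\mathrm{Bayes}}_T(\pi) \;=\; \mathbb{E}_{\theta \sim \Gamma}\, \mathbb{E}_{\theta}\!\left[\mathrm{Regret}_T(\pi;\theta)\right],
\]

where \(\theta\) indexes the unknown demand (or reward) model in a class \(\Theta\), \(\Gamma\) is a prior over \(\Theta\), and \(\mathrm{Regret}_T(\pi;\theta)\) is the cumulative regret of policy \(\pi\) over \(T\) online periods; here \(\Theta\), \(\Gamma\), and the regret notation are illustrative placeholders rather than the paper's own symbols. Under the worst-case criterion the adversarial choice of \(\theta\) can be made after the offline data are fixed, which is one intuition for why that criterion may not reflect the value of the offline sample, whereas the Bayesian criterion averages over \(\theta\) and so lets informative offline data shift the effective prior.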