Abstract

Thompson Sampling (TS) has become a prominent algorithmic approach in recent years. This review traces the evolution of TS and its variants: it highlights the innovative aspects of Neural Thompson Sampling (NeuralTS) and Meta-Thompson Sampling (Meta-TS), explains the aggressive exploration strategy used by Feel-Good Thompson Sampling (FGTS), and introduces Safe-LTS for the Linear Thompson Sampling (LTS) problem. The survey first systematically reviews the literature, then examines the theoretical underpinnings, algorithmic frameworks, and innovations of these TS variants, and finally offers insights into future directions. In short, NeuralTS handles high-dimensional reward functions through deep learning integration, Meta-TS uses meta-learning to adapt to unknown prior distributions, and FGTS applies an aggressive exploration strategy to handle pessimistic scenarios. The paper suggests that future research should emphasize enhancing generalizability, bridging the gap between theory and practice, and improving adaptability to complex and dynamic environments.
