Abstract
We explore whether quantum advantages can be found for the zeroth-order online convex optimization (OCO) problem, also known as bandit convex optimization with multi-point feedback. In this setting, a player attempts to minimize a sequence of adversarially generated convex loss functions, given access only to zeroth-order oracles: each loss function is a black box that returns the function value for any queried input. The procedure can be described as a T-round iterative game between the player and the adversary. In this paper, we present quantum algorithms for the problem and show for the first time that quantum advantages are possible for OCO problems. Specifically, our contributions are as follows. (i) When the player is allowed to query zeroth-order oracles O(1) times in each round as feedback, we give a quantum algorithm that achieves O(√T) regret without additional dependence on the dimension n, which outperforms the best known classical algorithm, whose regret bound carries an additional dependence on n. Note that the regret of our quantum algorithm matches the lower bound of classical first-order methods. (ii) We show that for strongly convex loss functions, the quantum algorithm can achieve O(log T) regret with O(1) queries as well, which means that it attains the same regret bound as classical algorithms in the full-information setting.
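To make the setting concrete, the following is a minimal classical sketch of the T-round game with two zeroth-order queries per round. It is not the quantum algorithm of this paper. It assumes the feasible set is the unit Euclidean ball, uses a standard two-point random-direction gradient estimate, and charges the loss at the played point itself, which is a simplification. All names (`project_to_ball`, `two_point_gradient`, `play_game`) and the parameters `eta` and `delta` are hypothetical choices made for illustration.

```python
import numpy as np


def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius (assumed feasible set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def two_point_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function-value queries.

    The black box f is queried at x + delta*u and x - delta*u for a random
    unit direction u. The dimension n appears as a scaling factor here.
    """
    n = x.shape[0]
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    return (n / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u


def play_game(losses, n, eta=0.1, delta=1e-3, seed=0):
    """Run the iterative game: one loss function per round, two queries per round."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    total_loss = 0.0
    for t, f in enumerate(losses, start=1):
        total_loss += f(x)  # simplification: loss charged at x, not at the query points
        g = two_point_gradient(f, x, delta, rng)
        # Projected gradient step with a decaying step size eta / sqrt(t).
        x = project_to_ball(x - (eta / np.sqrt(t)) * g)
    return x, total_loss
```

For example, calling `play_game` with a list of T copies of one fixed convex quadratic models a non-adaptive adversary; the regret is then `total_loss` minus T times the minimum value of that quadratic over the ball.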