Abstract
We consider an online supervised learning problem in which both the instances (input vectors) and the comparator (weight vector) are unconstrained. We exploit a natural scale-invariance symmetry of this unconstrained setting: the predictions of the optimal comparator are invariant under any linear transformation of the instances. Our goal is to design online algorithms which also enjoy this property, i.e., are scale-invariant. We start with the case of coordinate-wise invariance, in which the individual coordinates (features) can be arbitrarily rescaled. We give an algorithm that achieves an essentially optimal regret bound in this setup, expressed in terms of a coordinate-wise scale-invariant norm of the comparator. We then study general invariance with respect to arbitrary linear transformations. We first give a negative result, showing that no algorithm can achieve a meaningful bound in terms of a scale-invariant norm of the comparator in the worst case. We then complement this result with a positive one, providing an algorithm which “almost” achieves the desired bound, incurring only a logarithmic overhead in terms of the relative size of the instances.
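To make the coordinate-wise invariance concrete, here is a minimal numpy sketch (our illustration, not from the paper): rescaling each feature by an arbitrary factor s_i, while counter-scaling the corresponding comparator coordinate by 1/s_i, leaves the comparator's prediction ⟨u, x⟩ unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x = rng.normal(size=d)           # instance (input vector)
u = rng.normal(size=d)           # comparator (weight vector)
s = rng.uniform(0.1, 10.0, d)    # arbitrary per-coordinate scales

x_scaled = s * x                 # rescale each feature:     x_i -> s_i * x_i
u_scaled = u / s                 # counter-scale comparator: u_i -> u_i / s_i

# The comparator's prediction <u, x> is invariant under the rescaling.
assert np.isclose(u @ x, u_scaled @ x_scaled)
```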
Highlights
We consider a variant of online convex optimization in which both the instances and the comparator are unconstrained (Cesa-Bianchi and Lugosi, 2006; Shalev-Shwartz, 2011; Hazan, 2015).
This means that the predictions of the optimal comparator are invariant under any linear transformation of the instances, so that the scale of the weight vector is meaningful only relative to the scale of the instances.
We considered unconstrained online convex optimization, exploiting a natural scale-invariance symmetry: the predictions of the optimal comparator are invariant under any linear transformation of the instances.
Summary
We consider the following variant of online convex optimization (Cesa-Bianchi and Lugosi, 2006; Shalev-Shwartz, 2011; Hazan, 2015). Most of the work in online convex optimization assumes that the instances and the comparator are constrained to some bounded convex sets, often known to the algorithm in advance. We exploit a natural scale-invariance symmetry of the unconstrained setting: if we transform all instances by any invertible linear transformation A, x → Ax, and simultaneously transform the comparator by the (transposed) inverse of A, u → A^{-⊤}u, then the predictions and the comparator's loss do not change. The algorithm is a second-order method and runs in O(d²) time per trial.
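The invariance claimed above is easy to verify numerically, since (A^{-⊤}u)^⊤(Ax) = u^⊤A^{-1}Ax = u^⊤x. The following sketch (our illustration, assuming numpy) transforms a batch of instances by an invertible A and the comparator by A^{-⊤}, and checks that all predictions coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
X = rng.normal(size=(10, d))      # ten instances, one per row
u = rng.normal(size=d)            # comparator (weight vector)

A = rng.normal(size=(d, d))       # a (generically invertible) linear map
X_t = X @ A.T                     # transform instances:  x -> A x
u_t = np.linalg.inv(A).T @ u      # transform comparator: u -> A^{-T} u

# Predictions (and hence the comparator's loss) are unchanged.
assert np.allclose(X @ u, X_t @ u_t)
```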