Abstract

Stochastic optimization problems often involve data distributions that change in reaction to the decision variables. This is the case, for example, when members of the population respond to a deployed classifier by manipulating their features so as to improve the likelihood of being positively labeled. Recent works on performative prediction identify an intriguing solution concept for such problems: find the decision that is optimal with respect to the static distribution that the decision induces. Continuing this line of work, we show that, in the strongly convex setting, typical stochastic algorithms—originally designed for static problems—can be applied directly for finding such equilibria with little loss in efficiency. The reason is simple to explain: the main consequence of the distributional shift is that it corrupts algorithms with a bias that decays linearly with the distance to the solution. Using this perspective, we obtain convergence guarantees for popular algorithms, such as stochastic gradient, clipped gradient, prox-point, and dual averaging methods, along with their accelerated and proximal variants. In realistic applications, deployment of a decision rule is often much more expensive than sampling. We show how to modify the aforementioned algorithms so as to maintain their sample efficiency when performing only logarithmically many deployments.

Funding: This work was supported by the Division of Computing and Communication Foundations [Grant 1740551] and Division of Mathematical Sciences [Grant 1651851].
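
For intuition, the following is a minimal runnable sketch (not the paper's reference implementation) of the two deployment schemes the abstract describes: stochastic gradient that redeploys at every step, and a lazy variant that deploys only logarithmically often. The synthetic mean-shift family D(x) = N(x_base + eps*x, sigma^2 I), the quadratic loss, and all helper names (sample, grad, greedy_deploy_sgd, lazy_deploy_sgd) are illustrative assumptions, not the paper's notation.

```python
# Hedged sketch: stochastic gradient under a decision-dependent distribution D(x).
# Assumed synthetic setup: z ~ N(x_base + eps * x, sigma^2 I) and the
# mu-strongly convex loss f(x, z) = (mu/2) * ||x - z||^2.
import numpy as np

rng = np.random.default_rng(0)
d, eps, mu, sigma = 5, 0.3, 1.0, 0.1   # dimension, sensitivity, strong convexity, noise
x_base = rng.normal(size=d)

def sample(deployed_x):
    """Draw z ~ D(deployed_x): the population reacts to the deployed decision."""
    return x_base + eps * deployed_x + sigma * rng.normal(size=d)

def grad(x, z):
    """Gradient of f(x, z) = (mu/2) * ||x - z||^2 in x."""
    return mu * (x - z)

def greedy_deploy_sgd(steps=2000):
    """Redeploy after every step: each sample comes from D(current iterate)."""
    x = np.zeros(d)
    for t in range(1, steps + 1):
        z = sample(x)                        # distribution induced by the live decision
        x -= (1.0 / (mu * t)) * grad(x, z)   # standard 1/(mu*t) step size
    return x

def lazy_deploy_sgd(steps=2000, base=8):
    """Deploy only O(log steps) times: between deployments, samples still come
    from the stale distribution D(x_deployed) while the iterate keeps moving."""
    x = np.zeros(d)
    x_deployed, next_deploy = x.copy(), 1
    for t in range(1, steps + 1):
        if t >= next_deploy:                 # geometric schedule => log-many deployments
            x_deployed, next_deploy = x.copy(), next_deploy * base
        z = sample(x_deployed)
        x -= (1.0 / (mu * t)) * grad(x, z)
    return x

# Equilibrium of this synthetic family: x solving x = E[z] = x_base + eps * x.
x_star = x_base / (1 - eps)
print("greedy deploy gap:", np.linalg.norm(greedy_deploy_sgd() - x_star))
print("lazy deploy gap:  ", np.linalg.norm(lazy_deploy_sgd() - x_star))
```

In this toy model the equilibrium condition reduces to the fixed point x = x_base + eps*x, and the gradient bias incurred by sampling from a stale or shifting distribution is proportional to eps times the distance from x_star, which is why, as the abstract notes, standard step-size schedules still converge with little loss in efficiency.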
