Abstract

Zeroth-order optimization algorithms are an attractive alternative for solving stochastic optimization problems when gradient computations are expensive or closed-form loss functions are unavailable. Recently, there has been a surge of activity in applying zeroth-order optimization algorithms to a wide range of applications, including black-box adversarial attacks on machine learning models, reinforcement learning, and simulation-based optimization. Beyond the simplicity of a typical zeroth-order scheme, distributed implementations of zeroth-order methods that exploit data parallelism have recently attracted significant attention. This article presents an overview of recent work in the area of distributed zeroth-order optimization, focusing on constrained optimization settings and algorithms built around the Frank–Wolfe framework. In particular, we review different types of architectures, from master–worker-based decentralized to fully distributed, and describe appropriate zeroth-order projection-free schemes for solving constrained stochastic optimization problems tailored to these architectures. We discuss performance issues, including convergence rates and dimension dependence. In addition, we consider more refined extensions based on variance reduction, and describe and quantify the convergence rate of a variance-reduced decentralized zeroth-order optimization method inspired by martingale difference sequences. We also discuss the limitations of zeroth-order optimization frameworks stemming from their dimension dependence. Finally, we illustrate the use of distributed zeroth-order algorithms in the context of adversarial attacks on deep learning models.
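To make the two ingredients named in the abstract concrete, the sketch below pairs a two-point random-direction zeroth-order gradient estimator with a projection-free Frank–Wolfe update over an l1-ball constraint. This is only an illustrative, single-node sketch under assumptions of our own (the l1-ball constraint set, the smoothing parameter mu, the number of sampled directions, and the helper names zo_gradient_estimate and frank_wolfe_step are all hypothetical choices), not the specific distributed algorithms surveyed in the article.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-4, num_dirs=10, rng=None):
    """Two-point random-direction estimator: averages
    (f(x + mu*u) - f(x)) / mu * u over Gaussian directions u."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_dirs

def frank_wolfe_step(x, g, radius, t):
    """Projection-free update over the l1 ball of given radius:
    the linear minimization oracle returns a signed, scaled basis vector."""
    s = np.zeros_like(x)
    i = np.argmax(np.abs(g))
    s[i] = -radius * np.sign(g[i])   # argmin_{s in C} <g, s> for C = l1 ball
    gamma = 2.0 / (t + 2.0)          # standard Frank-Wolfe step size
    return (1.0 - gamma) * x + gamma * s

# Usage: minimize a black-box quadratic over the l1 ball of radius 1,
# using only function evaluations (no analytic gradients).
f = lambda x: float(np.sum((x - 0.3) ** 2))
x = np.zeros(5)
for t in range(200):
    g = zo_gradient_estimate(f, x, num_dirs=20)
    x = frank_wolfe_step(x, g, radius=1.0, t=t)
print("final point:", np.round(x, 3), "value:", round(f(x), 4))
```

In a decentralized or master–worker setting, each node would form such a zeroth-order estimate from its local data and the estimates would be aggregated (or gossiped) before the Frank–Wolfe update; the sketch omits that communication layer.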
