Abstract

We study a class of nonconvex-strongly-concave min-max optimization problems. A most commonly used algorithm for such problems in machine learning applications is the class of first-order algorithms where gradient descent and ascent steps are performed simultaneously or alternatively in each step. Despite its great success in practice, its theoretical properties are far from being understood. In fact, not much has been said about its convergence once the convex-concave assumption is absent. This is considerably different from minimization problems where many techniques are available to analyze nonconvex problems. It is not clear that if these techniques can be applied to min-max optimization. Despite the simplicity of this type of first-order methods, its properties are extremely difficult to analyze due to the nonlinear and nonconvex coupling between the maximization and minimization steps.(p)(/p)In this paper, we take a step toward this direction by examining a special class of nonconvex-strongly-concave min-max problems. We show that, with a proper stepsize choice, a simple alternating gradient descent/ascent (AGDA) algorithm would, in fact, converge to a stationary solution with a sublinear rate $\mathcal{O}\left( {1/t} \right)$, where t is the iteration number. We hope our analysis sheds light on future studies on the theoretical properties of relevant machine learning problems.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.