Introduction. Nonsmooth optimization problems arise in a wide range of applications, including engineering, finance, and deep learning, where activation functions such as ReLU often have discontinuous derivatives. Conventional optimization algorithms, developed primarily for smooth problems, face difficulties in nonsmooth settings because of these discontinuities and related irregularities. Possible remedies include smoothing the functions and applying dedicated nonsmooth optimization techniques. In particular, Shor's r-algorithm (Shor, Zhurbenko (1971), Shor (1979)), which uses space stretching in the direction of the difference of two successive subgradients, is a competitive nonsmooth optimization method (Bagirov et al. (2014)). However, the original r-algorithm is designed for the unconstrained minimization of convex functions.

The goal of the work. The standard technique for applying this algorithm to constrained problems is to use exact nonsmooth penalty functions (Eremin (1967), Zangwill (1967)). This requires a correct choice of the penalty coefficient, which must be sufficiently large. Norkin (2020, 2022) and Galvan et al. (2021) propose the so-called projective exact penalty function method, which in theory does not require choosing a penalty coefficient. The purpose of the present work is to study the applicability of the new exact projective nonsmooth penalty function method to solving constrained nonsmooth optimization problems by Shor's r-algorithm.

The results. In this paper, the original optimization problem with convex constraints is first transformed into an unconstrained problem by the projective penalty function method, and the r-algorithm is then used to solve the transformed problem. The results of testing this approach on problems with linear constraints, using a program implemented in Matlab, are presented.
The results of the present study show that the standard nonsmooth penalty method combined with Shor's r-algorithm is fast, because it uses a provided routine to compute the subgradients, but it requires a correct choice of the penalty parameter. The projective penalty method is slow, because in this study it computes the gradients by finite differences, but it is quite stable with respect to the choice of the penalty parameter. Further research will be aimed at investigating the differential properties of the projection mapping and at reducing the time needed to compute subgradients by means of parallel computation.

Keywords: subgradient descent, constrained optimization, r-algorithm, exact projective penalty.
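
As an illustration of why the classical exact nonsmooth penalty depends on the penalty coefficient, the following minimal sketch builds F(x) = f(x) + M * sum_i max(0, g_i(x)) for constraints g_i(x) <= 0 and minimizes it on a grid. The toy problem (minimize |x - 2| subject to x <= 1), the helper names `exact_penalty` and `grid_argmin`, and the grid search itself are assumptions made for illustration only; they are not taken from the paper, which uses Shor's r-algorithm in Matlab.

```python
def exact_penalty(f, constraints, M):
    """Return F(x) = f(x) + M * sum_i max(0, g_i(x)) for constraints g_i(x) <= 0.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    def F(x):
        violation = sum(max(0.0, g(x)) for g in constraints)
        return f(x) + M * violation
    return F


def grid_argmin(F, lo, hi, steps):
    """Crude minimizer: evaluate F on a uniform grid and return the best point.

    Stands in for a real nonsmooth solver such as the r-algorithm.
    """
    best_x, best_val = lo, F(lo)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = F(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x


# Toy problem (an assumption): minimize |x - 2| subject to x <= 1.
f = lambda x: abs(x - 2.0)
g = lambda x: x - 1.0  # feasible when g(x) <= 0

# Penalty coefficient too small: the penalized minimizer is infeasible.
x_small = grid_argmin(exact_penalty(f, [g], M=0.5), -1.0, 3.0, 400)

# Penalty coefficient large enough: the constrained minimizer is recovered.
x_large = grid_argmin(exact_penalty(f, [g], M=2.0), -1.0, 3.0, 400)

print(x_small, x_large)
```

In this toy case the penalty is exact only when M exceeds 1, the slope of the objective at the constraint boundary. With M = 0.5 the minimizer of F stays at the infeasible point x = 2, while with M = 2 it moves to the constrained solution x = 1.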