Abstract
This paper proposes a novel mathematical theory of adaptation to the convexity of loss functions, based on the definition of condense-discrete convexity (CDC). The theory is of particular value in stochastic settings and is applied here to the well-known stochastic gradient descent (SGD) method. Changing the definition of convexity affects the exploration of the learning-rate scheduler used in SGD, and therefore the convergence rate of the solution, which is used to measure the effectiveness of deep networks. In our methodology, the CDC convexity method and the learning rate are directly related to each other through the difference operator. In addition, we combine the developed theory of adaptation with trigonometric simplex (TS) designs to explore different learning-rate schedules for the weight and bias parameters within the network. Experiments confirm that using the new definition of convexity to explore learning-rate schedules makes the optimization more effective in practice and has a strong effect on the training of deep neural networks.