Abstract

Gradient descent is the core and foundation of neural network training, and gradient descent optimization heuristics have greatly accelerated progress in deep learning. Although these methods are simple and effective, why they work so well is still not fully understood, and gradient descent optimization in deep learning has become a hot research topic. Some research efforts have tried to combine multiple methods to assist network training, but such combinations tend to be empirical, without theoretical guidance. In this paper, a framework is proposed to illustrate the principle of combining different gradient descent optimization methods, based on an analysis of several adaptive methods and other learning rate methods. Furthermore, inspired by the principles of warmup, CLR, and SGDR, the concept of multistage training is introduced into gradient descent optimization, and a multistage, method-combination strategy for training deep learning models is presented. The effectiveness of the proposed strategy is verified through extensive deep neural network training experiments.
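
The multistage, method-combination idea can be pictured as switching the optimizer and learning rate schedule at fixed epoch boundaries. The following is only an illustrative sketch in PyTorch, not the paper's exact recipe: the particular stages (a short warmup, an Adam stage with cosine decay, and a final SGD-with-momentum stage), their lengths, and all hyperparameters are assumptions made for demonstration.

```python
# Illustrative multistage training loop (PyTorch). The concrete stages below
# are assumptions chosen for demonstration, not the paper's exact strategy.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)          # placeholder model
data = [(torch.randn(32, 10), torch.randint(0, 2, (32,))) for _ in range(8)]
loss_fn = nn.CrossEntropyLoss()

def make_stage(name, epochs):
    """Return (optimizer, scheduler) for a named stage (hypothetical choices)."""
    if name == "warmup":              # stage 1: short linear warmup with plain SGD
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        sched = torch.optim.lr_scheduler.LambdaLR(opt, lambda e: (e + 1) / epochs)
    elif name == "adam_cosine":       # stage 2: Adam combined with cosine decay
        opt = torch.optim.Adam(model.parameters(), lr=0.001)
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    else:                             # stage 3: SGD with momentum for the final phase
        opt = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    return opt, sched

stages = [("warmup", 2), ("adam_cosine", 5), ("sgd_momentum", 3)]  # assumed schedule

for stage_name, stage_epochs in stages:
    optimizer, scheduler = make_stage(stage_name, stage_epochs)
    for epoch in range(stage_epochs):
        for x, y in data:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()              # advance the learning rate schedule once per epoch
```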

Highlights

  • Today, thanks to the contribution of deep learning and deep neural networks, artificial intelligence (AI) is a thriving field with many practical applications and active research topics

  • Learning rate decay methods such as cosine decay and adaptive learning rate methods such as RMSprop [4] and Adam [5] are widely used in practical neural network training. The methods based on gradient estimation, such as Momentum [6] and Nesterov Accelerated Gradient (NAG) [7], are able to facilitate neural network model training (a PyTorch instantiation of these methods is sketched after these highlights)

  • We choose the most intuitive way to demonstrate this, comparing the performance of different adjustment strategies across methods and executing 10 epochs of training for each method. The performances are shown in Tables 6 and 7
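
As a concrete reference for the methods named in these highlights, the snippet below shows how the adaptive methods (RMSprop, Adam), the gradient-estimation methods (Momentum, NAG), and a cosine learning rate decay are typically instantiated in PyTorch. The hyperparameter values are common illustrative defaults, not the settings used in the paper's experiments.

```python
# Typical PyTorch constructions of the methods mentioned above.
# Hyperparameter values are common defaults, not the paper's settings.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model

rmsprop = torch.optim.RMSprop(model.parameters(), lr=0.001, alpha=0.99)
adam = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
momentum_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
nag_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)

# Cosine learning rate decay attached to any one of the optimizers above.
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(adam, T_max=10)
```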


Introduction

Thanks to the contribution of deep learning and deep neural networks, artificial intelligence (AI) is a thriving field with many practical applications and active research topics. Gradient descent is the core and foundation of a neural network: just as a car is built from many parts with the engine at its core, a deep neural network (DNN) is built from many parts with gradient descent optimization at its core. In the field of gradient descent optimization, quite a few methods have been proposed to improve the training performance of neural networks. The methods based on gradient estimation, such as Momentum [6] and Nesterov Accelerated Gradient (NAG) [7], are able to facilitate neural network model training. Although these methods work well, they are usually used alone for neural network model training. Loshchilov and Hutter [8] found that using a learning rate multiplier method can substantially improve Adam's performance, and they advocate not overlooking the combined use of learning rate methods with Adam. Such combinations can achieve certain improvements.
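
Loshchilov and Hutter's observation amounts to scaling Adam's base learning rate by a decaying multiplier during training. Below is a minimal PyTorch sketch of this combination, assuming a cosine-shaped multiplier applied once per epoch; the multiplier shape, base learning rate, and epoch count are illustrative assumptions, not the settings from [8].

```python
# Adam combined with a cosine learning rate multiplier (illustrative settings).
import math
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                      # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

total_epochs = 30
# The multiplier decays from 1.0 toward 0.0 over training;
# LambdaLR scales the base learning rate by the returned factor.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda epoch: 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs)),
)

for epoch in range(total_epochs):
    # ... run one epoch of forward/backward/optimizer.step() here ...
    scheduler.step()                          # apply the decayed multiplier for the next epoch
```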
