Learning Rate Decay Research Articles

81 Articles

Published in last 50 years

Articles published on Learning Rate Decay

Target-Network Update Linked with Learning Rate Decay Based on Mutual Information and Reward in Deep Reinforcement Learning

In this study, a target-network update for deep reinforcement learning (DRL) based on mutual information (MI) and rewards is proposed. In DRL, the target network is updated from the Q network to reduce training diversity and contribute to the stability of learning. If it is not properly updated, the overall update rate is reduced to mitigate this problem. Simply slowing down is not recommended because it reduces the speed of the decaying learning rate. Some studies have tried to address these issues with the t-soft update, which is based on the Student's t-distribution, or with methods that do not use a target network. However, there are situations in which the Student's t-distribution might fail or require more hyperparameters. A few studies have used MI in deep neural networks to improve the decaying learning rate and to update the target network directly by replaying experiences. Therefore, in this study, the MI and the reward provided in the experience replay of DRL are combined to improve both the decaying learning rate and the target-network update. Utilizing rewards is appropriate in environments with intrinsic symmetry. Experiments in various OpenAI Gym environments confirm that stable learning is possible while maintaining an improvement in the decaying learning rate.

Symmetry

Sep 28, 2023
Chayoung Kim
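
For readers unfamiliar with target-network updates in the entry above, the following is a minimal sketch of the conventional soft (Polyak) update that such work builds on. It is not the MI- and reward-based rule proposed in the article; the function name `soft_update` and the mixing coefficient `tau` are illustrative assumptions, and parameters are plain Python lists rather than network weights.

```python
def soft_update(target_params, online_params, tau):
    """Return new target parameters: tau * online + (1 - tau) * target.

    tau = 1.0 copies the online (Q-network) parameters outright (a hard update);
    a small tau makes the target network trail the online network slowly.
    """
    return [tau * q + (1.0 - tau) * t for q, t in zip(online_params, target_params)]


# Illustrative values only: a two-parameter "network".
target = [0.0, 0.0]
online = [1.0, -2.0]
for _ in range(3):
    target = soft_update(target, online, tau=0.1)
# After n soft updates toward a fixed online network, the target has covered
# a fraction 1 - (1 - tau) ** n of the gap.
```

A smaller `tau` gives a more stable but slower-moving target, which is the trade-off the abstract refers to.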

Sparrow Search Algorithm based BGRNN Model for Animal Healthcare Monitoring in Smart IoT

Rural regions rely heavily on agriculture for their economic survival. It is therefore crucial for farmers to adopt effective technical solutions that raise production, lessen the impact of issues associated with animal husbandry, and improve agricultural yields. Because of technological developments in computing and data storage, huge volumes of information are now available. The difficulty of extracting useful information from this mountain of data has prompted the development of novel approaches and tools, such as data mining, that can help close the informational gap. The goal of the proposed system was to evaluate data mining methods and apply them to the animal database to find meaningful relationships. The study's primary objective was to develop an IoT-based Integrated Animal Health Care System. Various sensors were used to collect physical and environmental data on the animals and their habitats, including temperature, heart rate, and air quality readings. This research contributes to the field of health monitoring by introducing an Optimised Bidirectional Gated Recurrent Neural Network (BiGRNN) approach. The BiGRNN is an improved form of the Gated Recurrent Unit (GRU) in which the input is processed both forward and backward through the network and the resulting outputs are connected to the same output layer. Since the BiGRNN method has a number of hyperparameters, it is optimised by means of the Sparrow Search Algorithm (SSA). The originality of the study lies in the development of an SSA technique for hyperparameter optimisation of the BiGRNN, with a focus on health forecasting. Hyperparameters such as momentum, learning rate, and weight decay can all be tuned with the SSA method. The results demonstrate that the proposed approach is more effective than current methods.

International Journal on Recent and Innovation Trends in Computing and Communication

Sep 1, 2023
V Gokula Krishnan + 4
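
The entry above tunes momentum, learning rate, and weight decay with a metaheuristic. As a rough illustration of what such a search loop looks like, here is a sketch that swaps in plain random search for the Sparrow Search Algorithm. The search ranges and the objective `toy_validation_loss` are hypothetical stand-ins, not values or code from the article; in practice the objective would train the model and return a validation metric.

```python
import math
import random


def sample_config(rng):
    """Draw one candidate; learning rate and weight decay are sampled on a log scale."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),
        "momentum": rng.uniform(0.5, 0.99),
        "weight_decay": 10 ** rng.uniform(-6, -2),
    }


def toy_validation_loss(cfg):
    """Hypothetical stand-in for 'train the model and measure validation loss'."""
    return (
        (math.log10(cfg["learning_rate"]) + 3.0) ** 2
        + (cfg["momentum"] - 0.9) ** 2
        + (math.log10(cfg["weight_decay"]) + 4.0) ** 2
    )


def random_search(n_trials, seed=0):
    """Keep the best of n_trials independently sampled configurations."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        loss = toy_validation_loss(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss


best_cfg, best_loss = random_search(n_trials=200)
```

Population-based methods such as SSA differ in how new candidates are generated from earlier ones, but they share this evaluate-and-keep-the-best structure.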

Using adaptive learning rate to generate adversarial images

Convolutional neural networks (CNNs) have proved efficient at image classification tasks, as they can automatically extract image features and make the corresponding prediction. However, the application of CNNs is challenged by their vulnerability to adversarial samples. These samples differ only slightly from legitimate samples, yet the CNN classifies them incorrectly. There are various ways to find adversarial samples. The most common method uses backpropagation to generate gradients as the directed perturbation. Rather than setting a fixed perturbation limit, in this paper we use the iterative fast gradient sign method to generate adversarial images with minimal perturbation. The CNNs were trained to perform surgical tool recognition as a configuration for the modern operating room. The coefficient, or learning rate, which determines the modification per iteration, was set to be adaptive instead of fixed. Several functions were used to perform the learning rate decay and their performance was compared. In particular, we propose a new adaptive learning rate algorithm that treats the loss as one of the factors that determine the learning rate for the remaining iterations. According to the experiments, our loss-adaptive learning rate method proved efficient at obtaining minimal perturbations for adversarial attacks.

Current Directions in Biomedical Engineering

Sep 1, 2023
Ning Ding + 1
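
The idea in the entry above, an iterative fast gradient sign attack whose step size decays across iterations, can be sketched on a toy model. The example below assumes a two-feature logistic classifier and an exponential step decay (`alpha0 * gamma ** step`); both are illustrative choices, and this is not the loss-adaptive rule the article proposes.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def ifgsm_decay(w, b, x, y, alpha0=0.2, gamma=0.9, max_steps=50):
    """Take signed-gradient steps of shrinking size until the prediction flips."""
    x_adv = list(x)
    for step in range(max_steps):
        if (predict(w, b, x_adv) >= 0.5) != (y == 1):
            return x_adv, step  # misclassified: stop with the current perturbation
        alpha = alpha0 * gamma ** step  # assumed exponential step-size decay
        grad = input_gradient(w, b, x_adv, y)
        x_adv = [xi + alpha * sign(g) for xi, g in zip(x_adv, grad)]
    return x_adv, max_steps


# Toy model and input, chosen only for illustration.
w, b = [1.0, -1.0], 0.0
x, y = [0.5, -0.5], 1
x_adv, steps = ifgsm_decay(w, b, x, y)
```

Stopping at the first misclassification keeps the perturbation small, and the decaying step reduces overshoot near the decision boundary.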

TLI-YOLOv5: A Lightweight Object Detection Framework for Transmission Line Inspection by Unmanned Aerial Vehicle

Unmanned aerial vehicles (UAVs) have become an important tool for transmission line inspection, and the inspection images taken by UAVs often contain complex backgrounds and many types of targets, which poses many challenges to object detection algorithms. In this paper, we propose a lightweight object detection framework, TLI-YOLOv5, for transmission line inspection tasks. Firstly, we incorporate the parameter-free attention module SimAM into the YOLOv5 network. This integration enhances the network’s feature extraction capabilities, without introducing additional parameters. Secondly, we introduce the Wise-IoU (WIoU) loss function to evaluate the quality of anchor boxes and allocate various gradient gains to them, aiming to improve network performance and generalization capabilities. Furthermore, we employ transfer learning and cosine learning rate decay to further enhance the model’s performance. The experimental evaluations performed on our UAV transmission line inspection dataset reveal that, in comparison to the original YOLOv5n, TLI-YOLOv5 increases precision by 0.40%, recall by 4.01%, F1 score by 1.69%, mean average precision at 50% IoU (mAP50) by 2.91%, and mean average precision from 50% to 95% IoU (mAP50-95) by 0.74%, while maintaining a recognition speed of 76.1 frames per second and model size of only 4.15 MB, exhibiting attributes such as small size, high speed, and ease of deployment. With these advantages, TLI-YOLOv5 proves more adept at meeting the requirements of modern, large-scale transmission line inspection operations, providing a reliable, efficient solution for such demanding tasks.

Electronics

Aug 4, 2023
Hanqiang Huang + 6
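
The cosine learning rate decay mentioned in the entry above is commonly written as lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)). The sketch below implements that standard form; the argument names and the example values are assumptions, since the abstract does not state the settings used.

```python
import math


def cosine_decay(step, total_steps, lr_max, lr_min=0.0):
    """Anneal the learning rate from lr_max to lr_min along half a cosine period."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Illustrative schedule over 100 steps starting at 0.01.
schedule = [cosine_decay(t, total_steps=100, lr_max=0.01) for t in range(101)]
```

The rate falls slowly at first, fastest around the midpoint, and slowly again near the end of training.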

Semantic segmentation using Firefly Algorithm-based evolving ensemble deep neural networks

Automatic segmentation of salient objects in real-world images has gained increasing interest owing to its use in diverse real-world applications, such as autonomous driving, medical diagnosis, aviation security, and underwater surveillance. In this research, we propose Firefly Algorithm (FA)-enhanced evolving ensemble deep networks for semantic segmentation and visual saliency prediction. An improved FA model is proposed to optimize network hyper-parameters. Specifically, it employs mutation operators and a neighbouring search strategy with granular search steps to establish search intensification. It also emphasizes search diversification by adopting multiple dynamic hybrid leaders and diverse adaptive sine and cosine search trajectories in full and randomly selected sub-dimensions to overcome stagnation. Because of its competent segmentation performance, DeepLabV3+ is fine-tuned using transfer learning with FA-based hyper-parameter identification. We optimize the learning rate, momentum and weight decay of the transfer learning network. This yields a number of optimized DeepLabV3+ networks with distinct learning configurations. An ensemble model is subsequently constructed from three optimized base networks to further strengthen segmentation performance. Evaluated on diverse, challenging semantic segmentation and saliency prediction tasks with underwater and medical image data sets, our evolving ensemble deep network shows significant superiority over other state-of-the-art deep networks and existing studies. The proposed FA model also outperforms other search methods, with statistical significance, on diverse mathematical landscapes.

Knowledge-Based Systems

Jul 25, 2023
Li Zhang + 6

A lightweight model for efficient identification of plant diseases and pests based on deep learning.

Plant diseases and pests have always been major contributors to losses that occur in agriculture. Currently, the use of deep learning-based convolutional neural network models allows for the accurate identification of different types of plant diseases and pests. To enable more efficient identification of plant diseases and pests, we design a novel network architecture called Dise-Efficient based on the EfficientNetV2 model. Our experiments demonstrate that training this model using a dynamic learning rate decay strategy can improve the accuracy of plant disease and pest identification. Furthermore, to improve the model's generalization ability, transfer learning is incorporated into the training process. Experimental results indicate that the Dise-Efficient model boasts a compact size of 13.3 MB. After being trained using the dynamic learning rate decay strategy, the model achieves an accuracy of 99.80% on the Plant Village plant disease and pest dataset. Moreover, through transfer learning on the IP102 dataset, which represents real-world environmental conditions, the Dise-Efficient model achieves a recognition accuracy of 64.40% for plant disease and pest identification. In light of these results, the proposed Dise-Efficient model holds great potential as a valuable reference for the deployment of automatic plant disease and pest identification applications on mobile and embedded devices in the future.

Frontiers in Plant Science

Jul 14, 2023
Hongliang Guan + 5

The Study of Performance for Face Detection Based on Multiple Representative Convolutional Neural Networks

Due to the diversity of deep learning models, choosing a suitable model for a specific task can be rather onerous. In this paper, the performance of three deep convolutional neural networks, namely VGG16, ResNet50, and MobileNetV2, on face detection was compared. Each model was trained on 11,900 images from the FDDB dataset that included various face sizes and orientations, with multiple augmentations including color alteration, blurring, and flipping. The final layers of the models were modified into a binary classification model indicating whether a face was found and a regression model giving the coordinates of the facial bounding box. All models were trained under the same settings: 40 epochs, batch size 64, binary cross-entropy loss and DIoU loss, and a learning rate of 0.0001 with a learning rate decay of 0.8 per epoch. The experimental results demonstrated that VGG16 outperformed ResNet50 and MobileNetV2 in terms of accuracy, with VGG16 achieving the highest score of 0.9240, followed by ResNet50 with 0.8568 and MobileNetV2 with 0.6028. The results suggest that VGG16 is a more suitable choice for face detection applications than ResNet50 and MobileNetV2, while ResNet50 and MobileNetV2 may provide higher accuracy for other image recognition tasks or real-time face detection. The findings in this paper can contribute to the selection of appropriate deep learning models for face detection.

Highlights in Science, Engineering and Technology

Jul 11, 2023
Ming Him Foun
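
To make the schedule in the entry above concrete, the sketch below works out a learning rate of 0.0001 decayed by 0.8 per epoch over 40 epochs. It assumes "decay of 0.8 per epoch" means the rate is multiplied by 0.8 after every epoch; the abstract does not spell out the exact rule.

```python
def exponential_decay(initial_lr, decay_rate, num_epochs):
    """Learning rate used in each epoch: initial_lr * decay_rate ** epoch."""
    return [initial_lr * decay_rate ** epoch for epoch in range(num_epochs)]


schedule = exponential_decay(initial_lr=1e-4, decay_rate=0.8, num_epochs=40)
# Epoch 0 uses 1e-4 and epoch 1 uses 8e-5; by the final epoch (index 39)
# the rate has shrunk to roughly 1.66e-8.
```

Under this reading the rate falls by almost four orders of magnitude, so the last epochs change the weights very little.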

PT-symmetric solitons and parameter discovery in self-defocusing saturable nonlinear Schrödinger equation via LrD-PINN.

We propose a physics-informed neural network with a learning rate decay strategy (LrD-PINN) to predict the dynamics of symmetric, asymmetric, and antisymmetric solitons of the self-defocusing saturable nonlinear Schrödinger equation with the PT-symmetric potential, boosting the predicted evolutionary distance by an order of magnitude. Taking symmetric solitons as an example, we explore the advantages of the learning rate decay strategy, analyze the anti-interference performance of the model, and optimize the network structure. In addition, the coefficients of the saturable nonlinearity strength and the modulation strength in the PT-symmetric potential are reconstructed from the dataset of symmetric soliton solutions. The application of more advanced machine learning techniques in the field of nonlinear optics can provide more powerful tools and richer ideas for the study of optical soliton dynamics.

Chaos: An Interdisciplinary Journal of Nonlinear Science

Jul 1, 2023
Bo-Wei Zhu + 5

Autonomous face mask detection using single shot multibox detector, and ResNet-50 with identity retrieval through face matching using deep siamese neural network.

The COVID-19 pandemic poses a global health challenge. The World Health Organization states that face masks are proven to be effective, especially in public areas. Real-time monitoring of face masks is challenging and exhaustive for humans. To reduce human effort and to provide an enforcement mechanism, an autonomous system has been proposed to detect non-masked people and retrieve their identity using computer vision. The proposed method introduces a novel and efficient method that involves fine-tuning the pre-trained ResNet-50 model with a new head layer for classification between masked and non-masked people. The classifier is trained using adaptive momentum optimization algorithm with decaying learning rate and binary cross-entropy loss. Data augmentation and dropout regularization are employed to achieve best convergence. During real-time application of our classifier on videos, a Caffe face detector model based on Single Shot MultiBox Detector is used to extract the face regions of interest from each frame, on which the trained classifier is applied for detecting the non-masked people. The faces of these people are then captured, which is passed on to a deep siamese neural network, based on VGG-Face model for face matching. The captured faces are compared with the reference images from the database, by extracting the features and calculating cosine distance. If the faces match, the details of that person are retrieved from the database and displayed on the web application. The proposed method has secured best results where the trained classifier has achieved 99.74% accuracy, and the identity retrieval model achieved 98.24% accuracy.

Journal of Ambient Intelligence and Humanized Computing

Jun 7, 2023
S Vignesh Baalaji + 5
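
The entry above trains its classifier with adaptive momentum optimization (Adam) and a decaying learning rate. The following sketch shows how the two combine, using the standard Adam update on a one-dimensional quadratic rather than a neural network. The exponential decay factor `gamma` and all numeric values are assumptions for illustration; the abstract does not give the schedule used.

```python
import math


def adam_with_decay(grad_fn, x0, lr0=0.1, gamma=0.99, steps=500,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam update with the base learning rate decayed as lr0 * gamma ** (t - 1)."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment (momentum) estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias corrections
        v_hat = v / (1.0 - beta2 ** t)
        lr = lr0 * gamma ** (t - 1)              # assumed exponential decay
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_final = adam_with_decay(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

The decay shrinks the step size late in training, which damps the oscillation around the minimum that a constant rate can leave behind.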

Mining belt foreign body detection method based on YOLOv4_GECA model

In the process of mining belt transportation, various foreign objects may appear, which will have a great impact on the crusher and belt, thus affecting production progress and causing serious safety accidents. Therefore, it is important to detect foreign objects in the early stages of intrusion in mining belt conveyor systems. To solve this problem, the YOLOv4_GECA method is proposed in this paper. Firstly, the GECA attention module is added to establish the YOLOv4_GECA foreign object detection model in the mineral belt to enhance the foreign object feature extraction capability. Secondly, based on this model, the learning rate decay of restart cosine annealing is used to improve the foreign object image detection performance of the model. Finally, we collected belt transport image information from the Pai Shan Lou gold mine site in Shenyang and established a belt foreign body detection dataset. The experimental results show that the average detection accuracy of the YOLOv4_GECA method proposed in this paper is 90.1%, the recall rate is 90.7%, and the average detection time is 30 ms, which meets the requirements for detection accuracy and real-time performance at the mine belt transportation site.

Scientific Reports

Jun 1, 2023
Dong Xiao + 4
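
The restart cosine annealing schedule named in the entry above is usually implemented as cosine annealing with warm restarts: the rate follows a cosine curve within each cycle and jumps back to its maximum when a new cycle begins. The sketch below assumes a first cycle of 10 epochs and a cycle-length multiplier of 2; these values and the argument names are illustrative, not taken from the article.

```python
import math


def cosine_warm_restarts(epoch, lr_max, lr_min=0.0, first_cycle=10, cycle_mult=2):
    """Cosine annealing that restarts at lr_max at the start of every cycle."""
    cycle_len = first_cycle
    t_cur = epoch
    # Walk through completed cycles to find the position inside the current one.
    while t_cur >= cycle_len:
        t_cur -= cycle_len
        cycle_len *= cycle_mult
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t_cur / cycle_len))


# With these settings, cycles start at epochs 0, 10, and 30.
schedule = [cosine_warm_restarts(e, lr_max=0.01) for e in range(70)]
```

Each restart lets the optimizer leave the region it settled into, and the lengthening cycles give later phases more time to converge.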

Design of optimal Elman Recurrent Neural Network based prediction approach for biofuel production

Renewable sources like biofuels have gained significant attention as a way to meet rising energy demand. Biofuels are useful in several domains of energy generation, such as electricity, power, and transportation. Owing to their environmental benefits, biofuels have also gained significant attention in the automotive fuel market. As the availability of biofuels becomes essential, effective models are required to handle and predict biofuel production in real time. Deep learning has become an important technique for modelling and optimizing bioprocesses. In this view, this study designs a new optimal Elman Recurrent Neural Network (OERNN) based prediction model for biofuel production, called OERNN-BPP. The OERNN-BPP technique pre-processes the raw data using empirical mode decomposition and a fine-to-coarse reconstruction model. The ERNN model is then applied to predict biofuel productivity. To improve the predictive performance of the ERNN model, its hyperparameters are optimized using the political optimizer (PO). The PO is used to select the hyperparameters of the ERNN, such as learning rate, batch size, momentum, and weight decay. A sizable number of simulations are run on the benchmark dataset, and the outcomes are examined from several angles. The simulation results demonstrate the proposed model's advantage over more recent methods for estimating biofuel output.

Scientific Reports

May 26, 2023
N Paramesh Kumar + 3

A Transfer Residual Neural Network Based on ResNet-50 for Detection of Steel Surface Defects

With the increasing popularity of deep learning, enterprises are replacing traditional inefficient and non-robust defect detection methods with intelligent recognition technology. This paper utilizes TL (transfer learning) to enhance the model’s recognition performance by integrating the Adam optimizer and a learning rate decay strategy. By comparing the TL-ResNet50 model with other classic CNN models (ResNet50, VGG19, and AlexNet), the superiority of the model used in this paper was fully demonstrated. To address the current lack of understanding regarding the internal mechanisms of CNN models, we employed an interpretable algorithm to analyze pre-trained models and visualize the learned semantic features of defects across various models. This further confirms the efficacy and reliability of CNN models in accurately recognizing different types of defects. Results showed that the TL-ResNet50 model achieved an overall accuracy of 99.4% on the testing set and demonstrated good identification ability for defect features.

Applied Sciences

Apr 23, 2023
Luying Zhang + 3

AdaD-FNN for Chest CT-Based COVID-19 Diagnosis

Coronavirus disease 2019 (COVID-19) has been a global public health emergency since December 2019, causing huge economic losses. To help radiologists strengthen their recognition of COVID-19 cases, we developed a computer-aided diagnosis system based on deep learning to automatically classify chest computed tomography images of COVID-19, tuberculosis, and healthy control subjects. Our novel classification model AdaD-FNN sequentially transfers the trained knowledge of an FNN estimator to the next FNN estimator while updating the weights of the samples in the training set with a decaying learning rate. This model inhibits the network from remembering noisy information and improves the learning of complex patterns in hard-to-identify samples. Moreover, we designed a novel image preprocessing model, F-U2MNet-C, which enhances the image features using fuzzy stacking and eliminates interference factors using U2MNet segmentation. Extensive experiments are conducted on four publicly available datasets, namely TLDCA, UCSD-AI4H, SARS-CoV-2, and TCIA, and the obtained classification accuracies are 99.52%, 92.96%, 97.86%, and 91.97%, respectively. Our system delivers compelling performance in assisting COVID-19 detection when compared with 22 state-of-the-art methods. We hope to help link biomedical research and artificial intelligence and to assist doctors, radiologists, and inspectors with diagnosis at each epidemic prevention site in the real world.

IEEE Transactions on Emerging Topics in Computational Intelligence

Feb 1, 2023
Xujing Yao + 5

Effects of neuromodulation-inspired mechanisms on the performance of deep neural networks in a spatial learning task

In recent years, the biological underpinnings of adaptive learning have been modeled, leading to faster model convergence and various behavioral benefits in tasks including spatial navigation and cue-reward association. Furthermore, studies have investigated how the neuromodulatory system, a major driver of synaptic plasticity and state-dependent changes in the brain neuronal activities, plays a role in training deep neural networks (DNNs). In this study, we extended previous studies on neuromodulation-inspired DNNs and explored the effects of neuromodulatory components on learning and single unit activities in a spatial learning task. Under the multiscale neuromodulatory framework, plastic components, dropout probability modulation, and learning rate decay were added to the single unit, layer, and whole network levels of DNN models, respectively. We observed behavioral benefits including faster learning and smaller error of ambulation. We then concluded that neuromodulatory components can affect learning trajectories, outcomes, and single unit activities, in a component- and hyperparameter-dependent manner.

iScience

Jan 23, 2023
Jie Mei + 2

Multi-label classification of chest X-ray images with pre-trained vision Transformer model

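Objective Computer-based detection and classification of diseases in chest X-ray images currently suffers from high misdiagnosis rates and low accuracy. Building on a pre-trained vision Transformer (ViT) model, this paper uses transfer learning to support chest X-ray diagnosis and to improve diagnostic accuracy and efficiency. Method A ViT model that incorporates a convolutional neural network (CNN) and was pre-trained on a very large natural-image dataset is selected. The model structure is fine-tuned, the backbone network is initialized with the pre-trained ViT parameters, and the model is then trained again on chest X-ray datasets to perform multi-label disease classification. Result On the IU X-Ray dataset, the average AUC (area under the ROC curve) of the ViT model is compared before and after transfer learning. The pre-trained ViT model reaches an average AUC of 0.774, which is 0.208 higher than without transfer learning. Ablation experiments on model structure and data preprocessing, together with a visualization of the ViT attention mechanism, further verify the model's effectiveness. Finally, the fine-tuned ViT model is trained on the Chest X-Ray14 and CheXpert datasets and reaches average AUC scores of 0.839 and 0.806, improvements of 0.014 to 0.031 over the compared methods. Conclusion Compared with other methods, the ViT model achieves higher multi-label classification accuracy on chest X-ray images, and transfer learning improves the classification performance and generalization of the ViT model while lowering training cost. The ablation experiments and model visualization show that a ViT model containing a CNN structure focuses on meaningful regions and efficiently captures the visual features of chest X-ray images.

Objective Chest X-ray screening and diagnosis is essential in radiology today. Interpretation of most chest X-ray images still depends on clinical experience and is prone to misdiagnosis and missed diagnoses. Computer-based techniques that automatically detect and identify one or more potential diseases in an image can improve diagnostic efficiency and accuracy. Compared with natural images, multiple lesions in a single chest X-ray image are hard to detect and distinguish accurately, because abnormal areas occupy a small proportion of the image and have complex representations. Convolutional neural network (CNN) based deep learning models are widely used in medical imaging. The CNN convolution kernel is sensitive to local detail and can extract rich image features. However, the convolution kernel cannot capture global information, and the extracted features contain redundant information such as background, muscles, and bones, which affects the model's performance in multi-label classification tasks to a certain extent. The vision Transformer (ViT) model has recently achieved strong results in computer vision tasks.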
The ViT can capture information from multiple regions of the entire image simultaneously and effectively. However, it requires training on large-scale datasets to perform well. Because of factors such as patient privacy and manual annotation costs, the size of chest X-ray image datasets is limited. To reduce the model's dependence on data scale and improve multi-label classification performance, we develop a CNN-based pre-trained ViT model that uses transfer learning for assisted diagnosis and multi-label classification of chest X-ray images. Method The CNN-based ViT model is pre-trained on a very large natural-image dataset to obtain the initial parameters of the model. The model structure is then adapted to the characteristics of the chest X-ray dataset. A 1×1 convolution layer converts the chest X-ray image channels from 1 to 3. The number of output nodes of the linear layer in the classifier is changed from 1,000 to the number of chest X-ray classification labels, and a sigmoid is used as the activation function. The backbone network is initialized with the pre-trained ViT parameters and then trained on the chest X-ray dataset to complete multi-label classification. The experiments use Python 3.7 and PyTorch 1.8 to construct the model and an RTX 3090 GPU for training, with a stochastic gradient descent (SGD) optimizer, a binary cross-entropy (BCE) loss function, an initial learning rate of 1E-3, and cosine annealing learning rate decay. For training, each image is scaled to 512×512 pixels, a 224×224 pixel area is randomly cropped as the model input, and data augmentation is applied at random using flipping, perspective transformation, shearing, translation, zooming, and brightness changes. For testing, the chest X-ray image is scaled to 256×256 pixels and a 224×224 center crop is fed to the trained model.
Result The experiment is performed on IU X-Ray, a small-scale chest X-ray dataset. The model is evaluated quantitatively using the average area under the ROC curve (AUC) across all classification labels. The average AUC of the pre-trained ViT model is 0.774. The accuracy and training efficiency of the non-pre-trained ViT model drop significantly: its average AUC reaches only 0.566, which is 0.208 lower. In addition, an attention heat map is generated from the ViT model, which strengthens the interpretability of the model. A series of ablation experiments is carried out on data augmentation, model structure, and batch size. The fine-tuned ViT model is also trained on the Chest X-Ray14 and CheXpert datasets, where the average AUC reaches 0.839 and 0.806, improvements of 0.014 and 0.031. Conclusion A pre-trained ViT model is used for multi-label classification of chest X-ray images via transfer learning. The experimental results show that the ViT has stronger multi-label classification performance on chest X-ray images, and that its attention mechanism helps it focus on lesion-relevant regions such as the interior of the chest cavity and the heart. Transfer learning can improve the classification performance and generalization of the ViT on small-scale datasets while greatly reducing training cost. Ablation experiments demonstrate that the combined CNN and Transformer model outperforms single-structure models. Data augmentation and a smaller batch size can improve model performance, but a smaller batch size also lengthens training. Future research can focus on extracting complex disease features and high-level semantic information, such as small lesions, disease location, and severity.

Journal of Image and Graphics

Jan 1, 2023
Suxia Xing + 4

Control Distance IoU and Control Distance IoU Loss for Better Bounding Box Regression

Numerous improvements in feedback mechanisms have contributed to the great progress in object detection. In this paper, we first present an evaluation-feedback module, which consists of an evaluation system and a feedback mechanism. Then we analyze and summarize traditional evaluation-feedback modules. We focus on both the evaluation system and the feedback mechanism, and propose Control Distance IoU and the Control Distance IoU loss function (CDIoU and CDIoU loss), which add no parameters to the models and bring significant improvements to several classical and emerging models. Finally, we propose Automatic Ground Truth Clustering (AGTC) and Floating Learning Rate Decay (FLRD) for faster regression in object detection. Experiments show that a coordinated evaluation-feedback module can effectively improve model performance. Both CNN and transformer-based detectors with CDIoU + CDIoU loss, AGTC, and FLRD achieve excellent performance. On MS COCO, the maximum AP improvement is 2.9% and the average AP improvement is 1.1%; on the Visdrone dataset, the maximum AP improvement is 8.2% and the average AP improvement is 3.7%.

Pattern Recognition

Dec 21, 2022
Chen Dong + 1

Galaxy Morphology Classification with DenseNet

Galaxy classification is crucial in astronomy, as galaxy types reveal information on how a galaxy formed and evolved. While manual classification requires extensive background knowledge and is time-consuming, deep learning algorithms provide a time-efficient and expedient way of accomplishing this task. Hence, this paper utilizes transfer learning from pre-trained CNN models and compares their performance on the Galaxy10 DECals Dataset. This paper applies the opening operation, data augmentation, class weights, and learning rate decay to further improve the models' performance. In our experiments, DenseNet121 outperforms the other models and achieves approximately 89% test-set accuracy within 30 minutes. The second best-performing model, EfficientNetV2S, takes twice as long and achieves 2.43% lower test-set accuracy.

Journal of Physics: Conference Series

Dec 1, 2022
Wuyu Hui + 3

A Scaling Transition Method from SGDM to SGD with 2ExpLR Strategy

In deep learning, the vanilla stochastic gradient descent (SGD) and SGD with heavy-ball momentum (SGDM) methods have a wide range of applications due to their simplicity and great generalization. This paper uses an exponential scaling method to realize a smooth and stable transition from SGDM to SGD, which combines the advantages of the fast training speed of SGDM and the accurate convergence of SGD (named TSGD). We also provide some theoretical results on the convergence of this algorithm. At the same time, we take advantage of the learning rate warmup strategy’s stability and the learning rate decay strategy’s high accuracy. A warmup–decay learning rate strategy with double exponential functions is proposed (named 2ExpLR). The experimental results on different datasets for the proposed algorithms indicate that the accuracy is improved significantly and that the training is faster and more stable.

Applied Sciences

Nov 24, 2022
Kun Zeng + 3
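
The entry above describes a warmup-decay learning rate built from two exponential functions but does not give the formula in the abstract. The sketch below shows one plausible shape, an exponential warmup term multiplied by an exponential decay term. This functional form, the function name, and the time constants are assumptions chosen for illustration; the published 2ExpLR definition may differ.

```python
import math


def warmup_decay_lr(step, peak_scale, tau_warmup, tau_decay):
    """Hypothetical double-exponential schedule: rises from 0, then decays toward 0."""
    warmup = 1.0 - math.exp(-step / tau_warmup)  # grows from 0 toward 1
    decay = math.exp(-step / tau_decay)          # shrinks from 1 toward 0
    return peak_scale * warmup * decay


# Illustrative time constants: fast warmup (5 steps), slow decay (50 steps).
schedule = [warmup_decay_lr(t, peak_scale=0.1, tau_warmup=5.0, tau_decay=50.0)
            for t in range(300)]
peak_step = max(range(len(schedule)), key=schedule.__getitem__)
```

With these constants the rate starts at zero, peaks after about a dozen steps, and then falls smoothly, which matches the qualitative warmup-then-decay behaviour the abstract describes.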

Surface roughness prediction of aircraft after coating removal based on optical image and deep learning

To quickly evaluate the surface quality of aircraft after coating removal, a surface roughness prediction method based on optical image and deep learning model is proposed. In this paper, the "optical image-surface roughness" data set is constructed, and SSEResNet for regression prediction of surface roughness is designed by using feature fusion method. SSEResNet can effectively extract the detailed features of optical images, and Adam method is used for training optimization. Experiments show that the proposed model outperforms the other seven CNN backbone networks compared. This paper also investigates the effect of four different learning rate decay strategies on model training and prediction performance. The results show that the learning rate decay method of Cosine Annealing with warm restart has the best effect, its test MAE value is 0.245 μm, and the surface roughness prediction results are more consistent with the real value. The work of this paper is of great significance to the removal and repainting of aircraft coatings.

Scientific Reports

Nov 12, 2022
Qichun Hu + 2

Variational Hyperparameter Inference for Few-Shot Learning Across Domains

The focus of few-shot learning research has recently been on the development of meta-learning, where a meta-learner is trained on a variety of tasks in the hope that it generalizes to new tasks. Tasks in meta-training and meta-testing are usually assumed to come from the same domain, which does not necessarily hold in real-world scenarios. In this paper, we propose variational hyperparameter inference for few-shot learning across domains. Based on an especially successful algorithm named model-agnostic meta-learning, the proposed variational hyperparameter inference integrates meta-learning and variational inference into the optimization of hyperparameters, which gives the meta-learner the adaptivity needed to generalize across domains. In particular, we choose to learn adaptive hyperparameters, including the learning rate and weight decay, to avoid failure when only a few labeled examples are available across domains. Moreover, we model hyperparameters as distributions instead of fixed values, which further enhances the generalization ability by capturing uncertainty. Extensive experiments are conducted on two benchmark datasets, covering few-shot learning both within and across domains. The results demonstrate that our method consistently outperforms previous approaches, and comprehensive ablation studies further validate its effectiveness on few-shot learning both within domains and across domains.

IEEE Transactions on Circuits and Systems for Video Technology

Nov 1, 2022
Lei Zhang + 4