Abstract
The configuration of hyperparameters in convolutional neural networks (CNNs) is crucial for determining their performance. However, traditional methods for hyperparameter configuration, such as grid search and random search, are time-consuming and labor-intensive. The optimization of CNN hyperparameters is a complex problem with multiple local optima, which poses a challenge for the traditional particle swarm optimization (PSO) algorithm: it is prone to getting stuck in local optima and achieving suboptimal results. To address these issues, we propose an adaptive dimensional Gaussian mutation PSO (ADGMPSO) to efficiently select optimal hyperparameter configurations. The ADGMPSO algorithm uses a cat chaotic map initialization strategy to generate an initial population with a more uniform distribution. It combines sine-based inertia weights with an asynchronously changing learning-factor strategy to balance global exploration and local exploitation. Finally, an elite-particle adaptive dimensional Gaussian mutation strategy is proposed to improve population diversity and convergence accuracy at different stages of evolution. The performance of the proposed algorithm was compared with that of five other evolutionary algorithms, namely PSO, BOA, WOA, SSA, and GWO, on ten benchmark test functions, and the results demonstrate the superiority of the proposed algorithm in terms of optimal value, mean value, and standard deviation. The ADGMPSO algorithm was then applied to hyperparameter optimization for the LeNet-5 and ResNet-18 network models. The results on the MNIST and CIFAR-10 datasets show that the proposed algorithm achieves higher accuracy and better generalization ability than other optimization approaches, such as PSO-CNN, LDWPSO-CNN, and GA-CNN.
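To make the abstract's ingredients concrete, the following is a minimal, illustrative PSO sketch that includes stand-ins for the three mechanisms named above: a time-varying sine-based inertia weight, asynchronously changing learning factors, and a Gaussian mutation applied to one dimension of the elite (global-best) particle. The specific formulas for `w`, `c1`, `c2`, and the mutation step size are assumptions for illustration, not the paper's published equations, and plain uniform initialization stands in for the cat chaotic map.

```python
import math
import random

def adgmpso_sketch(f, dim=5, n_particles=20, iters=100,
                   lo=-5.0, hi=5.0, seed=0):
    """Illustrative PSO with sine inertia weight, asynchronous learning
    factors, and elite Gaussian mutation (formulas are assumptions)."""
    rng = random.Random(seed)
    # Uniform initialization stands in for the cat chaotic map strategy.
    xs = [[rng.uniform(lo, hi) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pval = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = pbest[g][:], pval[g]

    for t in range(iters):
        # Assumed sine-based inertia weight: large early (exploration),
        # small late (exploitation).
        w = 0.4 + 0.5 * math.sin(math.pi / 2 * (1 - t / iters))
        # Asynchronously changing learning factors: the cognitive
        # coefficient c1 decays while the social coefficient c2 grows.
        c1 = 2.5 - 1.5 * t / iters
        c2 = 1.0 + 1.5 * t / iters
        for i in range(n_particles):
            for d in range(dim):
                vs[i][d] = (w * vs[i][d]
                            + c1 * rng.random() * (pbest[i][d] - xs[i][d])
                            + c2 * rng.random() * (gbest[d] - xs[i][d]))
                xs[i][d] = min(hi, max(lo, xs[i][d] + vs[i][d]))
            v = f(xs[i])
            if v < pval[i]:
                pbest[i], pval[i] = xs[i][:], v
                if v < gval:
                    gbest, gval = xs[i][:], v
        # Elite Gaussian mutation: perturb one randomly chosen dimension
        # of the global best; keep the mutant only if it improves fitness.
        cand = gbest[:]
        d = rng.randrange(dim)
        sigma = 0.1 * (hi - lo) * (1 - t / iters)  # shrinks over time
        cand[d] = min(hi, max(lo, cand[d] + rng.gauss(0.0, sigma)))
        cv = f(cand)
        if cv < gval:
            gbest, gval = cand, cv
    return gbest, gval

# Usage on the sphere benchmark function.
sphere = lambda x: sum(v * v for v in x)
best, val = adgmpso_sketch(sphere)
```

In hyperparameter optimization, `f` would instead train a CNN with the particle's decoded hyperparameters and return a validation error, with each dimension encoding one hyperparameter.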