Abstract

Swarm intelligence algorithms have been widely adopted for solving highly nonlinear, multimodal problems and have achieved tremendous success. However, their application to deep neural networks remains largely unexplored. Meanwhile, deep neural networks, especially convolutional neural networks (CNNs), have recently achieved breakthroughs on many previously intractable problems; nevertheless, their performance depends heavily on the chosen values of their hyper-parameters, whose fine-tuning is both labor-intensive and time-consuming. In this paper, we propose cPSO-CNN, a novel particle swarm optimization (PSO) variant for optimizing the hyper-parameter configuration of architecture-determined CNNs. cPSO-CNN uses a confidence function, defined by a compound normal distribution, to model experts' knowledge of CNN hyper-parameter fine-tuning and thereby enhance the canonical PSO's exploration capability. It also redefines PSO's scalar acceleration coefficients as vectors to better accommodate the varying ranges of CNN hyper-parameters. In addition, a linear prediction model is adopted to quickly rank the PSO particles, reducing the cost of fitness-function evaluation. Experimental results show that cPSO-CNN performs competitively against several reported algorithms in terms of both the quality of the resulting hyper-parameters and the overall computation cost.
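To make the vector-acceleration-coefficient idea concrete, the following Python sketch shows a canonical PSO velocity/position update in which the usual scalar coefficients c1 and c2 are replaced by per-dimension vectors, one entry per CNN hyper-parameter. This is a minimal illustration under stated assumptions, not the paper's implementation; the function and parameter names (pso_update, c1_vec, c2_vec) are hypothetical, and the confidence-function and particle-ranking components of cPSO-CNN are omitted.

```python
import numpy as np

def pso_update(position, velocity, personal_best, global_best,
               w=0.7, c1_vec=None, c2_vec=None, rng=None):
    """One hypothetical PSO velocity/position update for a single particle.

    position, velocity, personal_best, global_best: 1-D arrays with one
    entry per CNN hyper-parameter. c1_vec, c2_vec are per-dimension
    acceleration-coefficient vectors; canonical PSO would use scalars here.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = position.shape[0]
    # Default to the conventional value 2.0 in every dimension.
    c1_vec = np.full(d, 2.0) if c1_vec is None else c1_vec
    c2_vec = np.full(d, 2.0) if c2_vec is None else c2_vec

    r1, r2 = rng.random(d), rng.random(d)
    velocity = (w * velocity
                + c1_vec * r1 * (personal_best - position)   # cognitive term
                + c2_vec * r2 * (global_best - position))    # social term
    position = position + velocity
    return position, velocity
```

Because c1_vec and c2_vec broadcast element-wise, each hyper-parameter dimension can be accelerated differently, so, for example, a wide-ranging dimension such as the number of filters can be explored more aggressively than a narrow one such as momentum.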
