Deep learning techniques have achieved great success in computer vision, natural language processing, and other fields. The success of deep neural networks depends on sufficient training of their parameters. Neural networks are traditionally trained with gradient-based algorithms, which suffer from vanishing gradients, especially in deeper networks. Recently, heuristic algorithms have been proposed for optimizing deeper neural networks. In this paper, a random-mask and elitism univariate continuous estimation of distribution algorithm (EDA) based on the Gaussian model is proposed to pre-train a stacked auto-encoder, after which Stochastic Gradient Descent (SGD) based fine-tuning is carried out for local search. The improved EDA defines two individual update strategies: one group of individuals is generated from the constructed probabilistic model, and another is updated according to the statistics of the promising individuals and the mask information, which aims to reduce the probability of combinatorial explosion and the time consumption. In the simulations, different architectures, mask ratios, and promising-individual ratios are used to verify the effectiveness of the improved algorithm. The simulation results show that the estimation of distribution algorithm has a steady optimization ability for both shallow and stacked auto-encoders on the MNIST dataset when one-step pre-training is combined with SGD-based fine-tuning. The proposed model also achieves state-of-the-art performance on Fashion-MNIST.
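To make the two update strategies concrete, the following is a minimal sketch of a univariate Gaussian estimation of distribution algorithm with elitism and a random mask. It is not the paper's implementation. The function names (`sphere`, `masked_elitist_umda`), the toy objective standing in for the auto-encoder reconstruction loss, the default parameter values, and in particular the interpretation of the mask (masked dimensions are resampled from the Gaussian model, unmasked dimensions are copied from a promising individual) are all assumptions made for illustration. The SGD fine-tuning stage is omitted.

```python
import numpy as np

def sphere(x):
    """Toy objective (an assumption): sum of squares, minimised at the origin."""
    return np.sum(x ** 2, axis=-1)

def masked_elitist_umda(objective, dim=20, pop_size=60, elite_ratio=0.3,
                        mask_ratio=0.5, generations=100, seed=0):
    """Hypothetical sketch of a masked, elitist univariate Gaussian EDA."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(0.0, 1.0, size=(pop_size, dim))
    n_elite = max(2, int(elite_ratio * pop_size))
    half = pop_size // 2
    n_rest = pop_size - half - 1  # one slot is reserved for the best individual
    history = []

    for _ in range(generations):
        fitness = objective(pop)
        order = np.argsort(fitness)
        elites = pop[order[:n_elite]]  # promising individuals
        history.append(float(fitness[order[0]]))

        # Univariate Gaussian model: one independent mean and std per dimension.
        mu = elites.mean(axis=0)
        sigma = elites.std(axis=0) + 1e-8

        # Strategy 1: sample a group of individuals directly from the model.
        group1 = rng.normal(mu, sigma, size=(half, dim))

        # Strategy 2 (assumed reading of the mask): start from randomly chosen
        # promising individuals and resample only the masked dimensions.
        parents = elites[rng.integers(0, n_elite, size=n_rest)]
        mask = rng.random((n_rest, dim)) < mask_ratio
        resampled = rng.normal(mu, sigma, size=(n_rest, dim))
        group2 = np.where(mask, resampled, parents)

        # Elitism: carry the best individual over unchanged.
        pop = np.vstack([elites[:1], group1, group2])

    fitness = objective(pop)
    history.append(float(fitness.min()))
    return pop[np.argmin(fitness)], history

if __name__ == "__main__":
    best, history = masked_elitist_umda(sphere)
    print("initial best:", history[0], "final best:", history[-1])
```

Because the best individual is always carried over, the best fitness recorded in `history` never gets worse from one generation to the next. In a pre-training setting, `objective` would instead evaluate the reconstruction loss of an auto-encoder whose flattened weights are given by each individual.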