As real-world problems grow more complex, deep neural networks are growing explosively in size, in the number of layers, neurons, and connections. Optimizing hyperparameters to improve the prediction performance of neural networks has therefore become an important task. In the literature, methods for finding optimal parameters, such as sensitivity pruning and grid search, are complicated and computationally expensive. In this paper, a hyperparameter optimization strategy called junk neuron deletion is proposed. A neuron whose mean weight in the weight matrix is small contributes little to the prediction and is defined as a junk neuron. The strategy obtains a simplified network structure by deleting junk neurons, which shortens computation time and improves both prediction accuracy and the generalization capability of the model. An LSTM model is trained on time series generated by the Logistic, Hénon, and Rössler dynamical systems, and a near-optimal hyperparameter combination is obtained by grid search with a fixed step size. Under this combination, the part of the weight matrix that influences the model output is extracted, and neurons with small mean weights are eliminated under different thresholds. It is found that, with a mean-weight threshold of 0.1, identifying and deleting junk neurons significantly improves prediction efficiency. As the threshold increases, accuracy gradually falls back to the initial level while the same prediction performance is obtained at lower computational cost; increasing it further removes too many neurons, and prediction performance drops below the initial level due to underfitting. Using this strategy, the prediction performance of the LSTM model on several typical chaotic dynamical systems is significantly improved.
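A minimal sketch of the junk-neuron criterion, assuming a PyTorch implementation in which each hidden unit is scored by the mean absolute value of the weights connecting it to the output layer (the abstract's "partial weight matrix that influences the model output"). The threshold 0.1 follows the abstract; the model class and helper names (`LSTMPredictor`, `junk_neuron_mask`, `prune_junk_neurons`) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

THRESHOLD = 0.1  # mean-weight threshold reported in the abstract


class LSTMPredictor(nn.Module):
    """Illustrative one-layer LSTM with a linear readout for time-series prediction."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.readout(out[:, -1])  # predict the next value from the last step


def junk_neuron_mask(model: LSTMPredictor, threshold: float = THRESHOLD):
    """Score each hidden unit by the mean absolute weight it feeds into the
    output layer; units scoring below the threshold are flagged as junk."""
    w = model.readout.weight.detach().abs()  # shape (1, hidden_size)
    scores = w.mean(dim=0)                   # mean |weight| per hidden unit
    return scores >= threshold               # True = keep, False = junk


def prune_junk_neurons(model: LSTMPredictor) -> LSTMPredictor:
    """Rebuild a smaller LSTM that keeps only the non-junk hidden units."""
    keep = junk_neuron_mask(model)
    idx = keep.nonzero(as_tuple=True)[0]
    pruned = LSTMPredictor(hidden_size=len(idx))
    with torch.no_grad():
        # PyTorch stacks LSTM gate weights as (4*hidden, ...) in i, f, g, o order,
        # so each kept unit owns one row per gate block.
        for name in ("weight_ih_l0", "weight_hh_l0", "bias_ih_l0", "bias_hh_l0"):
            old = getattr(model.lstm, name)
            rows = torch.cat([idx + g * model.lstm.hidden_size for g in range(4)])
            new = old[rows]
            if name == "weight_hh_l0":
                new = new[:, idx]  # drop recurrent inputs from deleted units too
            getattr(pruned.lstm, name).copy_(new)
        pruned.readout.weight.copy_(model.readout.weight[:, idx])
        pruned.readout.bias.copy_(model.readout.bias)
    return pruned
```

Under these assumptions, the pruned model is a drop-in replacement for the original (e.g., `small = prune_junk_neurons(trained_model)`); raising `THRESHOLD` deletes more units, which mirrors the trade-off the abstract describes between saved computation and eventual underfitting.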