Abstract

The extreme learning machine (ELM) was originally proposed for training single-hidden-layer feedforward neural networks, overcoming the difficulties faced by the backpropagation (BP) learning algorithm and its variants. Recent studies show that ELM can be extended to multilayered feedforward neural networks in which a hidden node may itself be a subnetwork of nodes or a combination of other hidden nodes. Although the multilayered ELM (MELM) shows stronger nonlinear expression ability and stability than single-hidden-layer ELM in both theoretical and experimental results, deepening the network structure also aggravates the problem of parameter optimization, which typically demands more time for model selection and increases the computational complexity. This paper uses a Cholesky factorization strategy and Givens rotation transformations to select the hidden nodes of MELM, obtaining a number of nodes better suited to the network. The initial network starts with a large number of hidden nodes, the nodes are then pruned using the idea of ridge regression, and a compact neural network is finally obtained. The algorithm therefore eliminates the need to set the number of hidden nodes manually and is fully automatic. By reusing information from the previous step's connection weight matrix, recalculation of the weight matrix during network simplification can be avoided: the Cholesky factor is updated by Givens rotation transformations, yielding a fast decremental update of the current connection weight matrix and thus ensuring both the numerical stability and the efficiency of the pruning process. Empirical studies on several commonly used classification benchmark problems and on real datasets collected from the coal industry show that, compared with the traditional ELM algorithm, the pruned multilayered ELM (P-MELM) algorithm proposed in this paper finds the optimal number of hidden nodes automatically and has better generalization performance.
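
For orientation, the sketch below shows the ridge-regression solution that underlies the pruning idea: with hidden-layer output matrix H and target matrix T, the output weights are β = (HᵀH + λI)⁻¹HᵀT, computed here via a Cholesky factorization A = UᵀU rather than an explicit inverse. This is a minimal illustration, not the paper's implementation; the function name elm_output_weights and the regularization parameter lam are assumptions.

```python
import numpy as np
from scipy.linalg import solve_triangular

def elm_output_weights(H, T, lam=1e-3):
    """Ridge-regression output weights beta = (H'H + lam*I)^{-1} H'T.

    H : (n_samples, L) hidden-layer output matrix
    T : (n_samples, n_outputs) target matrix
    """
    A = H.T @ H + lam * np.eye(H.shape[1])   # A_L in the paper's notation
    B = H.T @ T                              # B_L
    U = np.linalg.cholesky(A).T              # upper triangular, A = U.T @ U
    # two triangular solves instead of forming A^{-1} explicitly
    return solve_triangular(U, solve_triangular(U.T, B, lower=True))
```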

Highlights

  • Extreme learning machine (ELM) was first proposed by Huang and has attracted extensive attention for its extremely fast learning speed, minimal human intervention, and easy implementation [1,2,3]

  • Decremental learning procedure of the pruned multilayered extreme learning machine (P-MELM). The process builds an optimal network structure, known as the decremental learning framework of the pruned multilayer ELM algorithm (P-MELM), through the calculation of the corresponding Cholesky factor U_L. It can be seen from formulas (14)–(20) that the P-MELM algorithm adopts a fast calculation format and achieves a decremental update of the connection weight matrix. The solution method for β_{1,L−1} based on Cholesky factorization and Givens rotation transformations makes full use of the information stored when computing β_{1,L}: U_{L−1} is obtained from U_L by a Givens rotation transformation, and B_{L−1} is obtained directly from B_L. Therefore, as the number of hidden nodes decreases successively, β_{1,L−1} can be computed rapidly by simple matrix arithmetic on the basis of β_{1,L} (see the downdate sketch after this list)

  • We have summarized the steps of the P-MELM algorithm as follows: Step 1: first, we limit the number of nodes. The maximum number of nodes is L_max, the minimum number of nodes is L_min, and the iteration stopping criterion is ξ_L. Then, we set the initial number of hidden nodes to L = L_max and calculate A_L and B_L (a runnable sketch of the full pruning loop follows this list)
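
The following is a minimal sketch of the Givens-rotation downdate referenced in the second highlight: given an upper-triangular Cholesky factor U with A = UᵀU, deleting hidden node j from A corresponds to deleting column j of U, which leaves one subdiagonal entry per affected column; a sequence of Givens rotations restores triangularity. The NumPy formulation and the function name cholesky_downdate are illustrative assumptions; the paper's exact update is given in its formulas (14)–(20).

```python
import numpy as np

def cholesky_downdate(U, j):
    """Remove hidden node j from the upper-triangular factor U (A = U.T @ U).

    Returns U1 such that U1.T @ U1 equals A with row and column j deleted.
    """
    V = np.delete(U, j, axis=1)               # L x (L-1), Hessenberg from column j on
    L = U.shape[0]
    for k in range(j, L - 1):
        a, b = V[k, k], V[k + 1, k]
        r = np.hypot(a, b)
        if r == 0.0:
            continue
        c, s = a / r, b / r
        rows = V[[k, k + 1], k:]               # copy of the two affected rows
        V[k, k:] = c * rows[0] + s * rows[1]       # rotated row k
        V[k + 1, k:] = -s * rows[0] + c * rows[1]  # rotated row k+1 (zeroes V[k+1, k])
    return np.triu(V[:-1, :])                  # last row is now numerically zero
```

Because the rotations are orthogonal, (QV)ᵀ(QV) = VᵀV, which is exactly A with row and column j removed, so the downdated factor is valid without ever rebuilding A.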
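
Building on that downdate, the full decremental loop from Step 1 onward might look as follows. This reuses cholesky_downdate from the previous sketch; the choice of which node to drop (smallest output-weight norm) and the use of training RMSE against the stopping criterion ξ are assumptions made for illustration, since the paper defines its own selection and stopping rules.

```python
import numpy as np
from scipy.linalg import solve_triangular

def solve_beta(U, B):
    """beta = (U.T @ U)^{-1} @ B via two triangular solves, no explicit inverse."""
    return solve_triangular(U, solve_triangular(U.T, B, lower=True))

def rmse(Y, T):
    return np.sqrt(np.mean((Y - T) ** 2))

def pmelm_prune(H, T, L_min, xi, lam=1e-3):
    """Decremental pruning: start from all L_max nodes, drop one per step.

    H : (n_samples, L_max) hidden-layer output matrix, T : (n_samples, m) targets.
    """
    A = H.T @ H + lam * np.eye(H.shape[1])    # A_L
    B = H.T @ T                               # B_L
    U = np.linalg.cholesky(A).T               # upper triangular, A = U.T @ U
    beta = solve_beta(U, B)
    best_err = rmse(H @ beta, T)
    while H.shape[1] > L_min:
        # drop the node whose output weights have the smallest norm (assumption)
        j = int(np.argmin(np.linalg.norm(beta, axis=1)))
        U_try = cholesky_downdate(U, j)       # Givens downdate of the factor
        B_try = np.delete(B, j, axis=0)       # B_{L-1} directly from B_L
        H_try = np.delete(H, j, axis=1)
        beta_try = solve_beta(U_try, B_try)
        err = rmse(H_try @ beta_try, T)
        if err - best_err > xi:               # stopping criterion xi (assumption)
            break
        U, B, H, beta, best_err = U_try, B_try, H_try, beta_try, err
    return H, beta
```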

Summary

Introduction

Extreme learning machine (ELM) was first proposed by Huang and has attracted extensive attention for its extremely fast learning speed, minimal human intervention, and easy implementation [1,2,3]. The process builds an optimal network structure, known as decremental learning of P-MELM, through the calculation of the corresponding Cholesky factor U_L. It can be seen from formulas (14)–(20) that the P-MELM algorithm adopts a fast calculation format and achieves a decremental update of the connection weight matrix. Therefore, if the number of hidden nodes decreases successively, β_{1,L−1} can be rapidly computed by simple matrix arithmetic on the basis of β_{1,L}. In contrast, if β_{1,L−1} is calculated using the method shown in (5), it must be recomputed by inverting a higher-order matrix and cannot be obtained directly from β_{1,L}. The P-MELM algorithm can therefore guarantee the learning accuracy while further improving the training speed. MELM must recalculate the output weights from scratch every time the number of hidden nodes is reduced; hence, when the number of hidden-layer nodes decreases gradually, P-MELM is simpler and takes less time than MELM.
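
To make the cost difference concrete, the following sketch contrasts the naive recomputation implied by (5), a fresh O(L³) solve at every pruning step, with the decremental route, which reuses the factor U_L at O(L²) per step. It reuses cholesky_downdate and solve_beta from the sketches above and is illustrative only.

```python
import numpy as np

# Naive route, as in (5): rebuild and re-solve from scratch after deleting node j.
def beta_naive(H, T, j, lam=1e-3):
    H1 = np.delete(H, j, axis=1)
    A1 = H1.T @ H1 + lam * np.eye(H1.shape[1])   # rebuild A_{L-1}: O(n L^2)
    return np.linalg.solve(A1, H1.T @ T)         # fresh dense solve: O(L^3)

# Decremental route: reuse the factor U_L computed for the larger network.
def beta_decremental(U, B, j):
    U1 = cholesky_downdate(U, j)                 # Givens rotations: O(L^2)
    B1 = np.delete(B, j, axis=0)                 # B_{L-1} from B_L directly
    return solve_beta(U1, B1)                    # two triangular solves: O(L^2)
```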

Algorithm Verification
A P-MELM model with M hidden layers and L hidden-layer nodes is established.
Conclusions and Discussion