Abstract

With the rapid development of machine learning, and of deep learning in particular, increasing effort has been devoted to designing new learning models with properties such as fast training with good convergence and incremental learning that overcomes catastrophic forgetting. In this paper, we propose a scalable wide neural network (SWNN) composed of multiple multi-channel wide RBF neural networks (MWRBF). Each MWRBF network focuses on a different region of the data and performs nonlinear transformations with Gaussian kernels. The number of MWRBFs in the proposed SWNN is determined by the scale and difficulty of the learning task. A splitting and iterative least squares (SILS) training method is proposed to keep training tractable on large, high-dimensional data. Because the least squares solve already yields good weights in the first iteration, only a few further iterations are needed to fine-tune the SWNN. Experiments were performed on several datasets, including gray-scale and colored MNIST data and hyperspectral remote sensing data (KSC, Pavia Center, Pavia University, and Salinas), and the results were compared with mainstream learning models. They show that the proposed SWNN is highly competitive with the other models.
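
The abstract does not spell out the implementation, but the building block it describes, a Gaussian-kernel feature map followed by a least-squares solve for the output weights, can be sketched roughly as below. Everything here (the function names gaussian_rbf_features and fit_rbf_readout, the ridge term, and the random choice of kernel centers) is an illustrative assumption rather than the authors' MWRBF code.

import numpy as np

def gaussian_rbf_features(X, centers, width):
    # Gaussian kernel activations of every sample around every fixed center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf_readout(X, Y, centers, width, ridge=1e-6):
    # Solve the output weights in closed form by (ridge-regularized) least squares.
    H = gaussian_rbf_features(X, centers, width)      # hidden activations
    A = H.T @ H + ridge * np.eye(H.shape[1])          # normal equations
    return np.linalg.solve(A, H.T @ Y)                # output weights

# Toy usage: centers drawn from the data region this sub-network is responsible for.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                         # toy inputs
Y = np.eye(3)[rng.integers(0, 3, size=200)]           # one-hot toy labels
centers = X[rng.choice(len(X), size=32, replace=False)]
W = fit_rbf_readout(X, Y, centers, width=1.0)
scores = gaussian_rbf_features(X, centers, 1.0) @ W   # class scores

A closed-form solve of this kind is what would make the first training pass cheap; subsequent iterations then only refine the weights.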

Highlights

  • It is well known that no single model can handle all kinds of learning tasks

  • Recently, researchers have devoted much work to developing new types of models as alternatives to deep learning, and to improving the performance of earlier learning models

  • A new model, the scalable wide neural network (SWNN), is proposed; it is generated incrementally in the wide direction from the proposed multi-channel wide radial basis function (RBF) neural networks (MWRBF), as illustrated in the sketch after this list
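
To make the wide, incremental growth idea concrete, the toy sketch below adds Gaussian-RBF sub-networks one data split at a time, reusing gaussian_rbf_features and fit_rbf_readout from the sketch under the abstract. The residual-fitting scheme and all names are illustrative assumptions; the paper's SILS procedure is not reproduced here.

def grow_wide(X, Y, n_subnets=4, n_centers=32, width=1.0, seed=0):
    # Hypothetical wide growth: each new sub-network gets its own data split
    # and is fitted against what the current ensemble still gets wrong.
    rng = np.random.default_rng(seed)
    subnets, residual = [], Y.astype(float).copy()
    splits = np.array_split(rng.permutation(len(X)), n_subnets)
    for idx in splits:
        centers = X[rng.choice(idx, size=min(n_centers, len(idx)), replace=False)]
        W = fit_rbf_readout(X[idx], residual[idx], centers, width)
        subnets.append((centers, W))
        residual = residual - gaussian_rbf_features(X, centers, width) @ W
    return subnets

def predict_wide(subnets, X, width=1.0):
    # Ensemble prediction is the sum of all sub-network outputs.
    return sum(gaussian_rbf_features(X, c, width) @ W for c, W in subnets)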



Introduction

It is well known that no single model can handle all kinds of tasks. Kontschieder et al. [4] proposed deep neural decision forests, which use random, differentiable decision trees to support representation learning in the hidden layers of deep networks. For complex learning tasks, models tend to become deep, with many layers and a great number of parameters, and often achieve good test performance and generalization. However, they remain poorly understood theoretically, and their training and generalization are hard to characterize because of their highly non-convex behaviour.
