Abstract

The sigmoid function is a widely used, bounded activation function for feedforward neural networks (FFNNs). A problem with bounded activation functions is that the data must be scaled to suit the fixed domain and range of the function. Alternatively, the activation function itself can be adapted by learning the gradient and range of the function alongside the FFNN weights. The purpose of this paper is to investigate whether the particle swarm optimization (PSO) algorithm is capable of training FFNNs that use adaptive sigmoid activation functions. The PSO algorithm is also compared against the gradient-based lambda-gamma backpropagation learning algorithm (LG-BP) on five classification and five regression data sets. Experiments are conducted with scaled and unscaled input data, as well as with target output ranges of increasing size. The PSO algorithm proves capable of training adaptive activation function FFNNs and significantly outperforms the LG-BP algorithm on all problems. With the PSO, the use of adaptive activation functions improves the training accuracy of the FFNNs but leads to worse generalization performance due to overfitting. Increasing the size of the target output range increases the overfitting and worsens the generalization performance. Less overfitting is observed on data sets with unscaled input data.
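To make the idea concrete, the sketch below illustrates one common lambda-gamma parameterisation of an adaptive sigmoid, in which lambda controls the slope (gradient) and gamma the output range, and shows how such parameters could be appended to a PSO particle's position vector next to the FFNN weights. The function form, the names, and the particle layout here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def adaptive_sigmoid(x, lam, gamma):
    # Common lambda-gamma form: f(x) = gamma / (1 + exp(-lam * x)).
    # lam adapts the slope (gradient); gamma adapts the output range.
    return gamma / (1.0 + np.exp(-lam * x))

# Under PSO training, lam and gamma can simply be appended to each
# particle's position vector alongside the FFNN weights, so the swarm
# optimises all parameters jointly without gradient information.
n_weights = 20                                    # hypothetical weight count
particle = np.random.uniform(-1.0, 1.0, n_weights + 2)
weights, lam, gamma = particle[:n_weights], particle[-2], particle[-1]

x = np.random.randn(n_weights)                    # dummy input vector
print(adaptive_sigmoid(weights @ x, lam, gamma))  # one neuron's activation
```

Because the PSO treats the position vector as a black box, learning lam and gamma in this way requires no derivative of the activation function, unlike the LG-BP algorithm.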
