In artificial neural network (ANN) learning, empirical risk can be expressed by the training error, while structural risk can be expressed by the network complexity. Learning from data is often viewed as a tradeoff between these two risks. In addition, balancing training error against validation error can mitigate overfitting to some extent, and network performance is also known to depend on the regularization term. To consider four factors, i.e. training error, validation error, network complexity and regularization term, simultaneously in the training process of a single-hidden-layer feedforward neural network (SLFN), a many-objective coevolutionary learning algorithm (MOCELA) integrated with an extreme learning machine auto-encoder (ELMAE), called MOCELA-ELMAE, is presented. In MOCELA, the non-dominated sorting genetic algorithm III (NSGA-III) is improved to handle a many-objective optimization model with hybrid variables, where binary coding is used for structure learning and real coding represents the input parameters, i.e. all input weights and hidden biases of the AE network. The output parameters of the AE, i.e. the output weights, are calculated analytically by a non-iterative learning rule. The network structure and connection parameters of the SLFN are then determined from those of the AE. MOCELA-ELMAE eventually collects a set of Pareto-optimal solutions, which represents multiple optimal SLFNs. For the final decision, the three best SLFNs with the minimum validation errors are selected as base classifiers for selective ensemble learning. Extensive experiments on benchmark classification data sets from the UCI machine learning repository show clear improvements when the proposed MOCELA-ELMAE is compared with NSGA-III based on hybrid coding and with completely non-iterative learning of SLFNs, respectively.
The experimental results also show that MOCELA-ELMAE performs much better than other state-of-the-art learning algorithms on many data sets.
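The abstract's core training step, decoding a hybrid-coded individual (binary genes for the network structure, real genes for the AE input weights and hidden biases) and computing the ELM-AE output weights non-iteratively, can be sketched as below. This is a minimal sketch, not the paper's implementation: the function and parameter names, the sigmoid activation, and the ridge-style regularization constant `C` are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode_and_solve(bits, reals, X, C=1.0):
    """Decode a hybrid-coded individual and train an ELM-AE non-iteratively.

    bits  : (max_hidden,) binary genes selecting which hidden units are active
    reals : (n_features + 1, max_hidden) real genes; rows 0..n_features-1 are
            input weights, the last row holds the hidden biases
    X     : (n_samples, n_features) data; an auto-encoder's target is X itself
    C     : regularization constant (hypothetical name, illustration only)
    """
    active = bits.astype(bool)
    W = reals[:-1, active]          # input weights of the selected units
    b = reals[-1, active]           # biases of the selected units
    H = sigmoid(X @ W + b)          # hidden-layer output matrix
    k = H.shape[1]
    # Closed-form ridge solution: beta = (H^T H + I/C)^(-1) H^T X
    beta = np.linalg.solve(H.T @ H + np.eye(k) / C, H.T @ X)
    recon_error = np.mean((H @ beta - X) ** 2)
    return beta, recon_error

# Toy usage with a fixed structure mask and random real-coded genes
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 8))
bits = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])   # 6 of 10 units kept
reals = rng.uniform(-1.0, 1.0, (9, 10))
beta, err = decode_and_solve(bits, reals, X)
print(beta.shape)  # (6, 8)
```

The closed-form solve is what makes the learning rule "non-iterative": only the binary structure genes and real input-side genes need to be evolved, while the output weights follow analytically from each candidate.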
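The final decision step, keeping the three SLFNs with the smallest validation errors as base classifiers, could combine their outputs by majority voting, a common choice in selective ensemble learning. The abstract does not state the actual combination rule, so the voting scheme and all names below are assumptions.

```python
import numpy as np

def selective_ensemble(val_errors, predictions, k=3):
    """Select the k models with the lowest validation error and majority-vote.

    val_errors  : (n_models,) validation error of each Pareto-optimal SLFN
    predictions : (n_models, n_samples) integer class predictions on test data
    """
    best = np.argsort(val_errors)[:k]     # indices of the k best models
    votes = predictions[best]             # (k, n_samples)
    # Majority vote per sample across the selected base classifiers
    final = np.array([np.bincount(col).argmax() for col in votes.T])
    return best, final

# Toy example: 5 candidate SLFNs, 4 test samples, binary labels
val_errors = np.array([0.12, 0.05, 0.30, 0.07, 0.09])
preds = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
])
best, final = selective_ensemble(val_errors, preds)
print(best.tolist(), final.tolist())  # [1, 3, 4] [0, 1, 0, 0]
```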