This study stems from a simple observation: as noted by Vapnik [37], an artificial neural network cannot be a universal approximator if the aggregation function used to design the artificial neuron contains no non-linearity. The usual approach is to follow a linear aggregation with a non-linear function, the so-called activation function. We ask whether this approach could be replaced by one that uses a natively non-linear aggregation function. Among the available non-linear aggregation functions, we focus here on aggregations based on weighted minimum and weighted maximum operations [8]. Because these operators were originally developed within the framework of possibility theory and fuzzy rules, they cannot easily be integrated into a neural network: the values they usually handle belong to [0,1]. For gradient-descent-based learning, a neuron must be an aggregation function that is differentiable with respect to its inputs and synaptic weights, and its variables (synaptic weights, inputs and outputs) must all be signed real values. We therefore propose an extension of weighted-maximum-based aggregation that enables this learning process. We show that such an aggregation can be seen as a combination of four Sugeno integrals. Finally, we compare this type of approach with the classical one.
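
For orientation, the classical weighted maximum and weighted minimum on [0,1] can be sketched as below. This is a minimal illustration assuming the standard possibility-theory definitions, max over i of min(w_i, x_i) and min over i of max(1 - w_i, x_i); it is not the signed-real extension proposed in the study, and the function names are chosen here for illustration only.

```python
from typing import Sequence


def weighted_max(x: Sequence[float], w: Sequence[float]) -> float:
    """Classical weighted maximum on [0,1]: max_i min(w_i, x_i)."""
    # Each weight caps the contribution of its input before the max is taken.
    return max(min(wi, xi) for wi, xi in zip(w, x))


def weighted_min(x: Sequence[float], w: Sequence[float]) -> float:
    """Classical weighted minimum on [0,1]: min_i max(1 - w_i, x_i)."""
    # A low weight raises the floor of its input, so it matters less in the min.
    return min(max(1.0 - wi, xi) for wi, xi in zip(w, x))


if __name__ == "__main__":
    inputs = [0.2, 0.9, 0.5]    # illustrative values in [0,1]
    weights = [1.0, 0.3, 0.6]   # illustrative weights, with max(weights) == 1
    print(weighted_max(inputs, weights))
    print(weighted_min(inputs, weights))
```

Both functions assume inputs and weights in [0,1], which is precisely the restriction that makes them hard to use directly in a network trained by gradient descent on signed real values.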