A Study on the effects of recursive convolutional layers in convolutional neural networks

Alberto Rossi, Markus Hagenbuchner, Franco Scarselli, Ah Chung Tsoi

doi:10.1016/j.neucom.2021.07.021

https://doi.org/10.1016/j.neucom.2021.07.021

Abstract

To overcome problems with the design of large networks, particularly with respect to network depth, this paper presents a new convolutional neural network (CNN) model which features fully recursive convolutional layers (RCLs). An RCL generalizes the classic one-stage feedforward convolutional layer (CL) by allowing full, direct feedback connections between the outputs of the CL and its inputs. A traditional deep CNN consisting of many CLs can then be generalized to include some CLs and some RCLs in the intermediate stages. We call the corresponding network a Convolutional Neural Network with Fully Recursive Perceptron Network (C-FRPN). An analysis of results obtained by applying the C-FRPN to three benchmark image classification datasets (CIFAR-10, SVHN, and ISIC) finds that (i) in general, a C-FRPN, even with only one RCL, performs better than the corresponding deep CNN built entirely from CLs, under the constraint of having the same number of unknown parameters; (ii) the performance of the C-FRPN varies with (a) where the RCLs are located and (b) the number of RCLs in the C-FRPN; and (iii) the effectiveness of the RCLs depends on the size of the training dataset. The results suggest that (a) it is advisable to use RCLs particularly when training on very large datasets, (b) it is best to prioritize placement of RCLs close to the input layer of the C-FRPN, and (c) it is advisable to increase the number of RCLs for as long as the training dataset can support them without overfitting being observed.
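The idea of feeding a convolutional layer's output back into its own input can be illustrated with a minimal NumPy sketch. The abstract does not give the RCL equations, so everything specific here is an assumption made for illustration: the update rule state = ReLU(conv(x, w_in) + conv(state, w_fb)), the zero initial state, the fixed number of unrolled steps, and the names `conv2d_same`, `recursive_conv_layer`, `w_in`, `w_fb` and `steps` are all hypothetical and should not be read as the authors' formulation.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive zero-padded 'same' 2D cross-correlation.

    x: array of shape (C_in, H, W)
    w: array of shape (C_out, C_in, k, k) with odd k
    Returns an array of shape (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    # Accumulate the contribution of each kernel offset (i, j).
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + wd]  # (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def recursive_conv_layer(x, w_in, w_fb, steps=3):
    """Hypothetical recursive convolutional layer (assumed update rule).

    The same weights are reused at every step, so the parameter count
    does not depend on `steps`.
    """
    c_out = w_in.shape[0]
    state = np.zeros((c_out,) + x.shape[1:])  # assumed zero initial state
    drive = conv2d_same(x, w_in)              # feedforward term, computed once
    for _ in range(steps):
        # Feedback: the layer's previous output re-enters through w_fb.
        state = np.maximum(0.0, drive + conv2d_same(state, w_fb))
    return state


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
w_in = rng.normal(size=(4, 3, 3, 3)) * 0.1
w_fb = rng.normal(size=(4, 4, 3, 3)) * 0.1
y1 = recursive_conv_layer(x, w_in, w_fb, steps=1)
y3 = recursive_conv_layer(x, w_in, w_fb, steps=3)
```

Under these assumptions, a single step from a zero state reduces to an ordinary CL followed by ReLU, which is one way to see the RCL as a generalization of the CL. Because the weights are shared across steps, extra iterations add effective depth without adding parameters.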
