Abstract
The classical convolutional neural network architecture adheres to static declaration procedures, meaning the shape of computation is usually predefined and the computation graph is fixed. This research proposes the concept of a pluggable micronetwork, which relaxes the static declaration constraint through dynamic layer configuration relay. The micronetwork consists of several parallel convolutional layer configurations and relays only the layer configuration that incurs the minimum loss. The configuration selection logic is based on the conditional computation method and is implemented as an output layer of the proposed micronetwork. The proposed micronetwork is implemented as an independent pluggable unit and can be used anywhere on the deep learning decision surface with no or minimal configuration changes. The MNIST, FMNIST, CIFAR-10 and STL-10 datasets have been used to validate the proposed research. The proposed technique proves efficient and supports the validity of the research by obtaining state-of-the-art performance in fewer iterations with wider and more compact convolutional models. We also briefly discuss the computational complexities involved in these advanced deep neural structures.
Highlights
The motivation for this research arises from identifying the challenges posed by fixed deep learning architectures and their inability to operate decisively in a static flow environment.
We added an output layer that calculates the loss incurred by each of the multiple parallel layers and dynamically relays the minimum-loss layer configuration to the remaining training pipeline.
We added the pluggable micronetwork as a layer in our custom model (Figure 7: MNIST experiment).
Summary
The motivation for this research arises from identifying the challenges posed by fixed deep learning architectures and their inability to operate decisively in a static flow environment. The model sometimes fails to learn the powerful visual indicators, or it learns the noise in the training input to such a degree that its ability to discriminate on new data is dramatically compromised, resulting in poor performance. In other words, either the discriminatory features are disregarded, or the noise and the random swings and shifts in the training data are taken up and learned as concepts by the static model. The redesigned NIN layer operates by adding a basic multilayer perceptron (MLP) network architecture, making it flexible enough to determine the appropriate configuration through a simple logic of anticipating the performance of each convolution setting before forwarding the convolution output.
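To make the selection logic concrete, the sketch below illustrates one way such a pluggable micronetwork could be expressed. It is a minimal sketch assuming PyTorch and a classification loss; the class name PluggableMicroNetwork, the candidate kernel sizes, and the per-branch auxiliary heads used to score each configuration are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a pluggable micronetwork: several parallel convolutional
# configurations, with only the minimum-loss configuration relayed downstream.
# (Assumed framework: PyTorch; branch choices and auxiliary heads are illustrative.)
import torch
import torch.nn as nn
import torch.nn.functional as F

class PluggableMicroNetwork(nn.Module):
    def __init__(self, in_channels, out_channels, num_classes, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # One candidate convolution configuration per kernel size (hypothetical choice).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, k, padding=k // 2),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        ])
        # Lightweight auxiliary head per branch, used only to anticipate each
        # configuration's performance on the current batch.
        self.aux_heads = nn.ModuleList([
            nn.Linear(out_channels, num_classes) for _ in kernel_sizes
        ])

    def forward(self, x, targets=None):
        outputs = [branch(x) for branch in self.branches]
        if targets is None:
            # Inference: no labels available, so fall back to the first
            # configuration (the test-time selection policy is an assumption here).
            return outputs[0]
        # Conditional computation: score each configuration by its batch loss
        # and relay only the minimum-loss configuration's feature map.
        losses = []
        for out, head in zip(outputs, self.aux_heads):
            logits = head(F.adaptive_avg_pool2d(out, 1).flatten(1))
            losses.append(F.cross_entropy(logits, targets))
        best = int(torch.stack(losses).argmin())
        return outputs[best]

# Example use inside a custom model on MNIST-sized inputs (hypothetical shapes).
block = PluggableMicroNetwork(in_channels=1, out_channels=32, num_classes=10)
features = block(torch.randn(8, 1, 28, 28), targets=torch.randint(0, 10, (8,)))
```

Because the unit only depends on its input and output channel counts, it can be dropped into an existing convolutional model as an ordinary layer, which is what "pluggable" refers to in this context.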