Abstract
Deep neural networks are an integral part of every recent breakthrough in learning systems. Much of this success is owed to new and powerful computational devices, which enable bigger and deeper models with higher learning capacity. Recently, many approaches have been developed to make training deeper models more tractable and to overcome problems such as exploding and vanishing gradients. In this paper, we take an unorthodox approach and propose a novel mutually independent feature flow method for efficient architecture design. The approach decomposes the traditional, continuously widening graphical model into fixed-width graphs by splitting the wider layers into conditionally independent branches of smaller width. The resulting model is a constrained parametric graph with fixed size and lower-depth kernels, which yields a large reduction in model parameters compared with traditional counterparts. The approach also provides a low-dimensional parameter space for gradient flow, leading to faster convergence and improved accuracy. We evaluate it on three styles of CNN architectures commonly used in the computer vision community, measuring performance on four benchmark datasets of varying complexity. The proposed approach not only greatly reduces model parameters but also improves model accuracy.
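To make the decomposition concrete, here is a minimal, hedged sketch (not the authors' code) of the core idea: a wide convolutional layer is replaced by several conditionally independent narrow branches, each operating on its own slice of the channels. The framework (PyTorch), the channel count of 256, and the choice of 4 branches are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch only: wide layer vs. mutually independent narrow branches.
# Assumptions: PyTorch, 256 channels, 4 branches (hypothetical example sizes).
import torch
import torch.nn as nn

in_ch, out_ch, branches = 256, 256, 4

# Traditional wide layer: every output channel sees every input channel.
wide = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

class BranchedConv(nn.Module):
    """Splits the channels into independent branches, so each branch carries
    its own feature flow and never mixes with the others."""
    def __init__(self, in_ch, out_ch, branches, k=3):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Conv2d(in_ch // branches, out_ch // branches, k, padding=k // 2)
            for _ in range(branches)
        ])

    def forward(self, x):
        chunks = torch.chunk(x, len(self.paths), dim=1)  # split along channels
        return torch.cat([p(c) for p, c in zip(self.paths, chunks)], dim=1)

narrow = BranchedConv(in_ch, out_ch, branches)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(wide), count(narrow))  # ~590K vs ~148K: roughly a 4x reduction
```

Because each branch only connects its own input slice to its own output slice, the parameter count drops roughly by the branch factor, which is the kind of reduction the abstract refers to; the paper's actual decomposition may differ in detail.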