Abstract

While research attention in deep learning has focused on pushing empirical results ever higher, remarkable progress has been made in machine learning applications in recent years. Yet deep learning based on artificial neural networks remains difficult to understand, as it is widely treated as a black-box approach. This lack of theoretical understanding not only hinders the deployment of deep networks in applications where high-stakes decisions must be made, but also limits their future development, where artificial intelligence is expected to be robust, predictable, and trustworthy. This paper aims to provide a theoretical methodology for investigating and training deep convolutional neural networks so as to ensure convergence. A mathematical model of convolutional neural networks based on matrix representations is first formulated, and an analytic layer-wise learning framework for convolutional neural networks is then proposed and tested on several common benchmark image datasets. The case studies show a reasonable trade-off between accuracy and analytic learning, and also highlight the potential of the proposed layer-wise learning method for finding the appropriate number of layers in actual implementations.
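The paper's exact matrix formulation is not reproduced in this summary. As an illustration of the general idea only, a convolution can be rewritten as a single matrix product via the standard im2col unfolding; the function name, stride, and shapes below are illustrative assumptions, not the paper's notation:

```python
import numpy as np

def im2col(x, k):
    """Unfold every k-by-k patch of a 2-D input into a column (stride 1, no padding)."""
    h, w = x.shape
    cols = np.empty((k * k, (h - k + 1) * (w - k + 1)))
    idx = 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            cols[:, idx] = x[i:i + k, j:j + k].ravel()
            idx += 1
    return cols

# With the input unfolded, 2-D convolution (correlation) is one matrix product.
x = np.arange(16.0).reshape(4, 4)     # toy 4x4 input
w = np.ones((3, 3))                   # toy 3x3 kernel
y = (w.ravel() @ im2col(x, 3)).reshape(2, 2)

# Reference: direct sliding-window computation agrees with the matrix form.
ref = np.array([[(x[i:i + 3, j:j + 3] * w).sum() for j in range(2)]
                for i in range(2)])
assert np.allclose(y, ref)
```

Representations of this kind are what allow convolutional layers to be analyzed with the tools of linear algebra, which is the starting point of matrix-based treatments of CNNs.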

Highlights

  • Convolutional neural networks (CNNs) have been successfully utilized for various applications with image inputs, such as image classification, pattern recognition, object detection, and image segmentation

  • Although CNN models trained by backpropagation (BP) have achieved great success, the majority of these achievements are empirical rather than theoretical

  • This demonstrates the possibility of using the layer-wise learning method as an indicator for determining the appropriate number of layers in final model implementations


Summary

INTRODUCTION

Convolutional neural networks (CNNs) have been successfully utilized for various applications with image inputs, such as image classification, pattern recognition, object detection, and image segmentation. Existing analytic learning frameworks are limited to fully connected networks (FCNs) and cannot be used for full CNNs, whose structure differs from that of FCNs. In [22], [23], overparameterized networks were analyzed for deep learning by assuming a very large width in each inner layer. Although there is a trade-off in test accuracy in some case studies, the results show that some deep CNNs may not need as many convolutional layers as their original architectures use to achieve reasonable accuracy. This demonstrates the possibility of using the layer-wise learning method as an indicator for determining the appropriate number of layers in final model implementations.
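The analytic layer-wise framework itself is not detailed in this excerpt. The sketch below only illustrates the general flavor of analytic (non-backpropagation) learning: fix a hidden layer, then solve its readout weights in closed form by least squares. The layer sizes, the random-feature hidden layer, and all names are assumptions made for illustration, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 samples, 20 features, 3 classes encoded as one-hot targets.
X = rng.standard_normal((100, 20))
T = np.eye(3)[rng.integers(0, 3, 100)]

def analytic_layer(X, T, width):
    """Fix a randomly initialized hidden layer, then compute the readout
    weights analytically via the least-squares pseudo-inverse, with no
    gradient descent or backpropagation involved."""
    W_in = rng.standard_normal((X.shape[1], width))
    H = np.tanh(X @ W_in)            # fixed hidden-layer activations
    W_out = np.linalg.pinv(H) @ T    # closed-form least-squares solution
    return W_in, W_out, H @ W_out

W_in, W_out, Y = analytic_layer(X, T, width=64)
train_acc = (Y.argmax(1) == T.argmax(1)).mean()
```

Training layers one at a time with closed-form solutions of this kind is what makes convergence analyzable, in contrast to end-to-end BP training; each solved layer's output can then serve as the input for the next.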

PROBLEM FORMULATION
EQUATION FORMULATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS
CASE STUDIES
Findings
CONCLUSION
