Abstract

In practical applications of image classification, methods based on convolutional neural networks are now standard. However, the success of these methods has not been explained sufficiently from a theoretical point of view. To address this, a statistical model for image classification, namely a hierarchical max-pooling model with additional local pooling, is introduced. The additional local pooling enables the hierarchical model to combine parts of the image whose relative distance from each other is variable. Various convolutional neural network image classifiers are introduced and analyzed with respect to the rate at which the misclassification risk of the estimates converges to the optimal misclassification risk. This analysis provides a theoretical explanation of why general convolutional neural network architectures that include some kind of local pooling layer are useful in some image classification settings, and it gives theoretical hints for choosing the right network architecture. Furthermore, the finite-sample performance of these network architectures is illustrated by applying them to simulated and real data.
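
The role of local pooling described in the abstract can be made concrete with a toy sketch. It is only a loose illustration and not the architecture or the estimates analyzed in the paper. The helper names (`conv2d_relu`, `local_max_pool`, `score`), the two hand-picked 2x2 kernels, and the 8x8 test image are all assumptions introduced here. Two "part detectors" are applied to an image, each response map is optionally smoothed by stride-1 local max pooling, the maps are added, and a global max gives the final score.

```python
import numpy as np

def conv2d_relu(image, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def local_max_pool(feature_map, size):
    """Stride-1 max pooling over size x size windows (size=1 is the identity)."""
    h, w = feature_map.shape
    out = np.empty((h - size + 1, w - size + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = feature_map[i:i + size, j:j + size].max()
    return out

def score(image, kernel_a, kernel_b, pool_size):
    """Combine two locally pooled part responses, then apply a global max."""
    a = local_max_pool(conv2d_relu(image, kernel_a), pool_size)
    b = local_max_pool(conv2d_relu(image, kernel_b), pool_size)
    return float((a + b).max())

# Hypothetical part detectors: a horizontal-bar and a vertical-bar filter.
k_horizontal = np.array([[1.0, 1.0], [-1.0, -1.0]])
k_vertical = np.array([[1.0, -1.0], [1.0, -1.0]])

# Toy image with one horizontal bar and one vertical bar at different places.
image = np.zeros((8, 8))
image[1, 1:3] = 1.0   # horizontal bar
image[4:6, 5] = 1.0   # vertical bar

score_without_local_pooling = score(image, k_horizontal, k_vertical, pool_size=1)
score_with_local_pooling = score(image, k_horizontal, k_vertical, pool_size=5)
print(score_without_local_pooling, score_with_local_pooling)
```

In this toy setting the two parts sit several pixels apart, so without local pooling their strongest responses never coincide at one position. With a pooling window wide enough to cover both, the strongest responses are combined and the score rises. This mirrors the informal claim that local pooling lets a model combine parts whose relative distance varies.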
