Abstract

We derive fast convergence rates for a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for the true model: (1) a smooth decision boundary, (2) a smooth conditional class probability, and (3) a margin condition (i.e., the probability of inputs near the decision boundary is small). We show that the DNN classifier learned using the hinge loss achieves fast convergence rates in all three cases, provided that the architecture (i.e., the number of layers, the number of nodes, and the sparsity) is carefully selected. An important implication is that DNN architectures can be used flexibly across various cases without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy, and show that it achieves a fast convergence rate when the noise exponent and the margin exponent are large. Although these two conditions are strong, we argue that they are not unreasonable for image classification problems. To support our theoretical findings, we present the results of a small numerical study comparing the hinge loss and the cross-entropy.
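
For concreteness, the two surrogate losses and the margin-type condition mentioned above can be sketched as follows; the notation here ($f(x)$ for the real-valued network output, $\eta(x)$ for the conditional class probability, and the exponent $q$) is generic and assumed for illustration rather than taken from the paper.

```latex
% Hinge loss and cross-entropy (binary logistic form) for a label y in {-1, +1}
% and a real-valued network output f(x):
\[
  \ell_{\mathrm{hinge}}\bigl(y, f(x)\bigr) = \max\{0,\; 1 - y f(x)\},
  \qquad
  \ell_{\mathrm{CE}}\bigl(y, f(x)\bigr) = \log\bigl(1 + e^{-y f(x)}\bigr).
\]
% A standard margin (low-noise) condition, with conditional class probability
% \eta(x) = P(Y = 1 \mid X = x) and noise exponent q: for all small t > 0,
\[
  \Pr\bigl( \lvert 2\eta(X) - 1 \rvert \le t \bigr) \le C\, t^{q}.
\]
```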

Highlights

  • Deep learning [Hinton and Salakhutdinov, 2006, Larochelle et al., 2007, Goodfellow et al., 2016] has received much attention for dimension reduction and classification of objects, such as images, speech, and language

  • Many researchers have demonstrated that deep neural networks (DNNs) are much more efficient in representing certain complex functions than their shallow counterparts [Montufar et al., 2014, Raghu et al., 2016, Eldan and Shamir, 2016], which has been reconfirmed by Yarotsky [2017] and Petersen and Voigtlaender [2018], who showed that DNNs can approximate a large class of functions, including even discontinuous functions, with a parsimonious number of parameters

  • We prove that a DNN classifier learned with the hinge loss can achieve fast convergence rates in various situations

Summary

Introduction

Deep learning [Hinton and Salakhutdinov, 2006, Larochelle et al., 2007, Goodfellow et al., 2016] has received much attention for dimension reduction and classification of objects, such as images, speech, and language. We justify the use of the cross-entropy in learning a DNN by showing that the corresponding classifier achieves a fast convergence rate when most data have a conditional class probability close to one or zero. This assumption is reasonable for image recognition because human beings recognize most real-world images quite well.
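
As a rough illustration of the kind of numerical comparison mentioned in the abstract, the sketch below trains the same ReLU network once with the hinge loss and once with the cross-entropy (in its binary logistic form) on synthetic data. The architecture, data, and hyperparameters are placeholders and are not those used in the paper.

```python
# Minimal sketch: train a ReLU network with the hinge loss and with the
# cross-entropy (binary logistic form) on synthetic data, then compare
# test accuracy. All settings are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def make_data(n=2000, d=10):
    """Synthetic binary data with labels in {-1, +1} and a linear true boundary."""
    X = torch.randn(n, d)
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).float() * 2 - 1
    return X, y

def relu_net(d, width=64, depth=3):
    """A fully connected ReLU network with a single real-valued output f(x)."""
    layers, in_dim = [], d
    for _ in range(depth):
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, 1))
    return nn.Sequential(*layers)

def train(loss_name, X, y, epochs=200, lr=1e-2):
    net = relu_net(X.shape[1])
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        f = net(X).squeeze(1)
        if loss_name == "hinge":
            loss = torch.clamp(1 - y * f, min=0).mean()  # max(0, 1 - y f(x))
        else:
            loss = F.softplus(-y * f).mean()             # log(1 + exp(-y f(x)))
        loss.backward()
        opt.step()
    return net

X_tr, y_tr = make_data()
X_te, y_te = make_data()
for name in ("hinge", "cross-entropy"):
    net = train(name, X_tr, y_tr)
    with torch.no_grad():
        pred = (net(X_te).squeeze(1) > 0).float() * 2 - 1
        acc = (pred == y_te).float().mean().item()
    print(f"{name}: test accuracy = {acc:.3f}")
```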

Notations
Estimation of the classifier with DNNs
Necessity of the hinge loss
Learning DNN with the hinge loss
Fast convergence rates of DNN classifiers with the hinge loss
Case 1
Case 2
Case 3
Remarks regarding adaptive estimation
Use of cross-entropy
Concluding Remarks
Complexity measures of a class of functions
Convergence rate of the excess φ-risk for general surrogate losses
Generic convergence rate for the hinge loss
Entropy of the class of DNNs
Proof of Theorem 1
Proof of Theorem 2
Proof of Theorem 3
Proof of Theorem 4
Proof of Proposition 4
DNN architectures used for the experiments