Abstract

Deep neural networks have shown remarkably successful performance on a wide range of tasks, but a theory of why they work so well is still in its early stages. Recently, the expressive power of neural networks, which is important for understanding deep learning, has received considerable attention. Classic results, provided by Cybenko, Barron, and others, state that a network with a single hidden layer and a suitable activation function is a universal approximator. More recently, attention has turned to how width affects the expressiveness of neural networks, i.e., to universal approximation theorems for deep neural networks with the Rectified Linear Unit (ReLU) activation function and bounded width. Here, we show how any continuous function on a compact subset of ℝ^{n_in}, n_in ∈ ℕ, can be approximated by a ReLU network whose hidden layers have at most n_in + 5 nodes, in view of an approximate identity.
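The classic single-hidden-layer picture referenced above is easy to make concrete. The sketch below (a minimal numpy illustration; the function name shallow_relu_interpolant and the chosen target are ours, not from the paper) writes the piecewise-linear interpolant of a 1-D continuous function as a single hidden layer of ReLU units. It illustrates the width-unbounded classical result, not the paper's bounded-width construction with at most n_in + 5 nodes per hidden layer.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def shallow_relu_interpolant(f, a, b, n_hidden):
    """Single-hidden-layer ReLU network whose output is the piecewise-linear
    interpolant of f on [a, b] with n_hidden knots (classical construction)."""
    knots = np.linspace(a, b, n_hidden + 1)      # interpolation points x_0, ..., x_n
    values = f(knots)
    slopes = np.diff(values) / np.diff(knots)    # slope of f's interpolant on each piece
    coeffs = np.diff(slopes, prepend=0.0)        # output weights: c_i = s_i - s_{i-1}
    bias0 = values[0]

    def network(x):
        # hidden layer: ReLU(x - x_i); output layer: bias plus weighted sum
        x = np.asarray(x, dtype=float)[..., None]
        return bias0 + relu(x - knots[:-1]) @ coeffs

    return network

# Approximate a continuous target on a compact interval.
target = lambda x: np.sin(3 * x) + 0.3 * x ** 2
net = shallow_relu_interpolant(target, 0.0, 2.0, n_hidden=64)

grid = np.linspace(0.0, 2.0, 1000)
print("max |f - net| on [0, 2]:", np.max(np.abs(target(grid) - net(grid))))
```

Increasing n_hidden drives the uniform error to zero on the compact interval, which is exactly the trade-off the bounded-width results replace with depth.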

Highlights

  • Over the past several years, deep neural networks have achieved state-of-the-art performance in a wide range of tasks such as image recognition/segmentation and machine translation

  • Most of the recent results on universal approximation theory are about the Rectified Linear Unit (ReLU) network [5,13,14,15,16,17,18,19,20]

  • In 2017, Lu et al. [14] presented a universal approximation theorem for deep neural networks with ReLU activation functions and hidden layers of bounded width; since then, the expressive power of depth in ReLU networks with bounded width has received a lot of attention


Summary

Introduction

Over the past several years, deep neural networks have achieved state-of-the-art performance in a wide range of tasks such as image recognition/segmentation and machine translation (see the review article [1] and the recent book [2] for more background). The Rectified Linear Unit (ReLU) activation function is the most popular choice in practical use of neural networks [12]. For this reason, most of the recent results on universal approximation theory are about the ReLU network [5,13,14,15,16,17,18,19,20]. Cohen et al. [13] constructed a deep convolutional neural network with the ReLU activation function that cannot be realized by a shallow network unless the number of nodes in its hidden layer exceeds an exponential bound. In 2017, Lu et al. [14] presented a universal approximation theorem for deep neural networks with ReLU activation functions and hidden layers of bounded width; since then, the expressive power of depth in ReLU networks with bounded width has received a lot of attention.

Main Result
Proof of Theorem 1
General-Dimensional Input
Conclusions
