Abstract

This article concerns the expressive power of depth in neural nets with ReLU activations and a bounded width. We are particularly interested in the following questions: What is the minimal width w_min(d) so that ReLU nets of width w_min(d) (and arbitrary depth) can approximate any continuous function on the unit cube [0,1]^d arbitrarily well? For ReLU nets near this minimal width, what can one say about the depth necessary to approximate a given function? We obtain an essentially complete answer to these questions for convex functions. Our approach is based on the observation that, due to the convexity of the ReLU activation, ReLU nets are particularly well suited to represent convex functions. In particular, we prove that ReLU nets with width d + 1 can approximate any continuous convex function of d variables arbitrarily well. These results then give quantitative depth estimates for the rate of approximation of any continuous scalar function on the d-dimensional cube [0,1]^d by ReLU nets with width d + 3.
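
As a worked formula (my own paraphrase of the definition implicit in the questions above, not notation quoted from the paper), the minimal width can be written as

```latex
\[
w_{\min}(d) \;=\; \min\Bigl\{\, w \in \mathbb{N} \;:\;
  \forall\, f \in C([0,1]^d),\ \forall\, \varepsilon > 0,\
  \text{there is a ReLU net } \mathcal{N} \text{ of width } w
  \text{ (and some finite depth) with }
  \sup_{x \in [0,1]^d} \bigl| f(x) - \mathcal{N}(x) \bigr| < \varepsilon
\,\Bigr\}.
\]
```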

Highlights

  • Over the past several years, deep neural nets have become the state of the art in a remarkable number of machine learning problems, from mastering Go to image recognition/segmentation and machine translation. Despite all their practical successes, a robust theory of why they work so well is in its infancy.

  • We show that every convex function on [0,1]^d that is piecewise affine with N pieces can be represented exactly by a ReLU net with width d + 1 and depth N (see the sketch after this list).

  • In this article, we considered the expressive power of ReLU networks with bounded hidden-layer widths.
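
The following identities are a minimal sketch of why ReLU units pair naturally with convex piecewise affine functions (standard facts stated in my own notation, not formulas quoted from the paper): such a function is a maximum of its affine pieces, and a pairwise maximum costs exactly one ReLU unit.

```latex
\[
f(x) \;=\; \max_{1 \le k \le N} \ell_k(x),
\qquad \ell_k(x) = a_k \cdot x + b_k,
\qquad x \in [0,1]^d,
\]
\[
\max(u, v) \;=\; v + \operatorname{ReLU}(u - v)
\quad \text{for all } u, v \in \mathbb{R},
\qquad \operatorname{ReLU}(t) = \max(t, 0).
\]
```

Iterating the second identity over the N pieces, one piece per layer, is the basic shape of a width-(d + 1), depth-N construction.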


Summary

Introduction

Over the past several years, deep neural nets have become the state of the art in a remarkable number of machine learning problems, from mastering Go to image recognition/segmentation and machine translation (see the review article [1] for more background). ReLU nets of width w can approximate any positive convex function on [0,1]^d arbitrarily well (3). Theorem 1 addresses Q2, the second of the questions above, by providing quantitative estimates on the depth of a ReLU net with width d + 1 that approximates a given convex function. We prove that the depth of the network that computes such a function is bounded by the number of affine pieces it contains. This extends the results of Arora-Basu-Mianjy-Mukherjee (e.g., Theorem 2.1 and Corollary 2.2 in [2]). We show that every convex function on [0,1]^d that is piecewise affine with N pieces can be represented exactly by a ReLU net with width d + 1 and depth N.
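
The snippet below is a minimal NumPy sketch of this kind of construction, written for illustration under the assumption that f is given explicitly as a maximum of N affine pieces; it mirrors the shape of the argument (each hidden layer has d + 1 ReLU units: d copy x, which is valid since x >= 0 on [0,1]^d, and one updates a running maximum via max(u, v) = v + ReLU(u - v)), but it is not the paper's proof.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def convex_pwl_via_narrow_relu_net(x, A, b):
    """Evaluate f(x) = max_k (A[k] @ x + b[k]) with a width-(d+1), depth-N ReLU net.

    Hidden layer k holds (x_1, ..., x_d, t_k), where t_k = ReLU(m_{k-1} - l_k(x)),
    and the running maximum m_k = max_{j <= k} l_j(x) is recovered affinely as
    m_k = t_k + l_k(x).  Illustrative sketch only.
    """
    N, d = A.shape
    assert np.all(x >= 0), "copying x through ReLU assumes x lies in [0,1]^d"
    ell = lambda k: A[k] @ x + b[k]   # k-th affine piece l_k(x)
    m = ell(0)                        # m_0 := l_1(x), so layer 1 gives t_1 = 0
    for k in range(N):
        # Hidden layer k: d+1 ReLU units -> (ReLU(x), ReLU(m_{k-1} - l_k(x)))
        x = relu(x)                   # identity on [0,1]^d
        t = relu(m - ell(k))
        m = t + ell(k)                # affine readout: running max over pieces 1..k+1
    return m                          # = max_k l_k(x)

# Quick check against direct evaluation of the maximum of the affine pieces.
rng = np.random.default_rng(0)
d, N = 3, 7
A, b = rng.normal(size=(N, d)), rng.normal(size=N)
for _ in range(5):
    x = rng.uniform(size=d)
    assert np.isclose(convex_pwl_via_narrow_relu_net(x, A, b), np.max(A @ x + b))
```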

Statement of Results
Relation to Previous Work
Proof of Theorem 2
Proof of Theorem 1
Conclusions
