Limitations of shallow networks representing finite mappings

Věra Kůrková

doi:10.1007/s00521-018-3680-1

Abstract

Limitations of capabilities of shallow networks to efficiently compute real-valued functions on finite domains are investigated. Efficiency is studied in terms of network sparsity and its approximate measures. It is shown that when a dictionary of computational units is not sufficiently large, computation of almost any uniformly randomly chosen function either represents a well-conditioned task performed by a large network or an ill-conditioned task performed by a network of a moderate size. The probabilistic results are complemented by a concrete example of a class of functions which cannot be efficiently computed by shallow perceptron networks. The class is constructed using pseudo-noise sequences which have many features of random sequences but can be generated using special polynomials. Connections to the No Free Lunch Theorem and the central paradox of coding theory are discussed.

Full Text