Abstract

We consider the class of 2-layer feed-forward neural networks with sigmoidal activations – one of the oldest black-box learning machines – and ask the question: Under what conditions can it successfully learn from a random linear projection of the data? Part of this question has been previously attempted in the literature: a high-probability bound has been given on the absolute difference between the outputs of the network on the sample before and after random projection, provided that the target dimension is at least Ω(M²(log MN)), where M is the size of the hidden layer and N is the number of training points. By contrast, in this paper we prove that a lower target dimension, independent of both N and M, suffices not only to guarantee low distortion of the outputs but also to ensure good generalisation for learning the network on randomly projected data. We do not require a sparse representation of the data; instead, our target dimension bound depends on the regularity of the problem expressed as norms of the weights. These are uncovered in our analysis by the use of random projection, which fulfils a regularisation role on the input layer weights.
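The abstract describes training a 2-layer sigmoidal network on a random linear projection of the data. The sketch below is an illustrative reconstruction of that setup, not the authors' implementation: the Gaussian projection matrix, the squared-loss gradient-descent training, and the dimensions d, k, M, N are all assumptions chosen for the example.

```python
# Minimal sketch (not the paper's code): project the data with a random
# linear map, then train a 2-layer sigmoidal network on the projection.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: N points in d dimensions with binary labels (illustrative).
N, d, k, M = 500, 100, 20, 10
X = rng.standard_normal((N, d))
w_true = rng.standard_normal(d)
y = (X @ w_true > 0).astype(float)

# Random linear projection R: d -> k with i.i.d. Gaussian entries,
# scaled so that E[||R x||^2] = ||x||^2.
R = rng.standard_normal((k, d)) / np.sqrt(k)
X_proj = X @ R.T                       # projected sample, shape (N, k)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 2-layer feed-forward network on the projected data:
# f(x) = v^T sigmoid(W x + b), trained by gradient descent on squared loss.
W = rng.standard_normal((M, k)) * 0.1
b = np.zeros(M)
v = rng.standard_normal(M) * 0.1
lr = 0.5

for _ in range(2000):
    H = sigmoid(X_proj @ W.T + b)      # hidden activations, shape (N, M)
    out = H @ v                        # network outputs, shape (N,)
    err = out - y                      # squared-loss residuals
    # Gradients of the mean squared error w.r.t. v, W, b.
    grad_v = H.T @ err / N
    grad_H = np.outer(err, v) * H * (1 - H)
    grad_W = grad_H.T @ X_proj / N
    grad_b = grad_H.sum(axis=0) / N
    v -= lr * grad_v
    W -= lr * grad_W
    b -= lr * grad_b

train_acc = ((H @ v > 0.5) == (y > 0.5)).mean()
print(f"training accuracy on projected data: {train_acc:.2f}")
```

A hypothetical experiment in the spirit of the paper would compare this network's outputs and test error against the same architecture trained on the unprojected data, varying the target dimension k.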
