Abstract

Training deep neural networks (DNNs) on high-dimensional data with no spatial structure poses a major computational problem. It implies a network architecture with a huge input layer, which greatly increases the number of weights, often making the training infeasible. One solution to this problem is to reduce the dimensionality of the input space to a manageable size, and then train a deep network on a representation with fewer dimensions. Here, we focus on performing the dimensionality reduction step by randomly projecting the input data into a lower-dimensional space. Conceptually, this is equivalent to adding a random projection (RP) layer in front of the network. We study two variants of RP layers: one where the weights are fixed, and one where they are fine-tuned during network training. We evaluate the performance of DNNs with input layers constructed using several recently proposed RP schemes. These include: Gaussian, Achlioptas’, Li’s, subsampled randomized Hadamard transform (SRHT) and Count Sketch-based constructions. Our results demonstrate that DNNs with an RP layer achieve competitive performance on high-dimensional real-world datasets. In particular, we show that SRHT and Count Sketch-based projections provide the best balance between the projection time and the network performance.
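
For concreteness, the dense (Gaussian) and sparse (Achlioptas', Li's) projection matrices named above can be sketched as follows. This is a minimal numpy illustration using one common scaling convention; the function names and dimensions are our own choices for illustration, not code from the paper, and the SRHT and Count Sketch constructions are not shown.

    import numpy as np

    rng = np.random.default_rng(0)

    def gaussian_rp(d, k):
        # Dense Gaussian projection: entries drawn from N(0, 1/k), so that
        # E[||x R||^2] = ||x||^2 for a row vector x (a common convention).
        return rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))

    def sparse_rp(d, k, s=3):
        # Achlioptas' construction for s = 3; Li's "very sparse" variant
        # uses larger s (e.g. s = sqrt(d)). Entries are +-sqrt(s/k) with
        # probability 1/(2s) each, and 0 otherwise.
        signs = rng.choice([1.0, 0.0, -1.0], size=(d, k),
                           p=[1 / (2 * s), 1 - 1 / s, 1 / (2 * s)])
        return np.sqrt(s / k) * signs

    # Project n = 100 samples from d = 10000 down to k = 256 dimensions;
    # the projected data X_low is what the DNN is then trained on.
    X = rng.normal(size=(100, 10_000))
    X_low = X @ sparse_rp(10_000, 256, s=int(np.sqrt(10_000)))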

Highlights

  • Deep-learning methods excel in many classical machine learning tasks, such as image and speech recognition or sequence modelling [1]

  • We study two ways of training this architecture: one where the parameters of the random projection (RP) layer are fixed during training, and one where they are fine-tuned with error backpropagation

  • We studied the viability of training deep neural networks with a random projection layer


Summary

Introduction

Deep-learning methods excel in many classical machine learning tasks, such as image and speech recognition or sequence modelling [1]. The motivation for this work stems from the problem of training DNNs on unstructured data with a large number of dimensions. When there is no exploitable input structure, training DNNs on high-dimensional data poses a significant computational problem. The reason for this is the implied network architecture, and in particular an input layer which may contain billions of weights. Even with recent advances in GPGPU computing, training networks with this number of parameters is infeasible. Learning in such applications is often performed with linear classifiers, usually support vector machines or logistic regression [3]. We show that this problem can be solved by incorporating random projection into the network architecture.
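
To make this architecture concrete, the sketch below shows a network whose first layer is a random projection, in the fixed-weight and fine-tuned variants studied in the paper. It is a minimal PyTorch-style sketch: the Gaussian projection and the layer sizes are assumptions made for illustration, not the authors' implementation, which also covers sparse, SRHT and Count Sketch projections.

    import torch
    import torch.nn as nn

    class RPNet(nn.Module):
        # A DNN whose input layer is a random projection (RP). With
        # fine_tune=False the RP weights stay fixed (a pure dimensionality
        # reduction step); with fine_tune=True they are updated by error
        # backpropagation like any other layer.
        def __init__(self, d_in, d_rp, d_hidden, n_classes, fine_tune=False):
            super().__init__()
            R = torch.randn(d_in, d_rp) / d_rp ** 0.5   # Gaussian RP matrix
            self.rp = nn.Parameter(R, requires_grad=fine_tune)
            self.mlp = nn.Sequential(
                nn.Linear(d_rp, d_hidden), nn.ReLU(),
                nn.Linear(d_hidden, n_classes),
            )

        def forward(self, x):              # x: (batch, d_in)
            return self.mlp(x @ self.rp)   # project, then classify

    # Example: 50000-dimensional inputs reduced to 512 dimensions before the MLP.
    net = RPNet(d_in=50_000, d_rp=512, d_hidden=256, n_classes=2, fine_tune=False)
    logits = net(torch.randn(8, 50_000))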

Random projection matrices
Neural networks with random projection layer
Fixed‐weight random projection layer
Fine‐tuned random projection layer
Experiments on synthetic datasets
Experiments on real‐world datasets
Related work
Findings
Conclusions
