Abstract

Biological learning systems are outstanding in their ability to learn from limited training data compared to the most successful learning machines, i.e., Deep Neural Networks (DNNs). Which key aspects underlie this data efficiency gap is an unresolved question at the core of biological and artificial intelligence. We hypothesize that one important aspect is that biological systems rely on mechanisms such as foveation to reduce unnecessary input dimensions for the task at hand, e.g., the background in object recognition, while state-of-the-art DNNs do not. Datasets used to train DNNs often contain such unnecessary input dimensions, and these lead to more trainable parameters. Yet, it is not clear whether this affects the DNNs' data efficiency: DNNs are known to be robust to an increasing number of parameters in the hidden layers, but it is uncertain whether this robustness extends to parameters added through the input layer. In this paper, we investigate the impact of unnecessary input dimensions on the DNNs' data efficiency, namely, the number of examples needed to achieve a given generalization performance. Our results show that unnecessary input dimensions that are task-unrelated substantially degrade data efficiency. This highlights the need for mechanisms that remove task-unrelated dimensions, such as foveation for image classification, in order to enable data efficiency gains.

Highlights

  • The success of Deep Neural Networks (DNNs) contrasts with the still distant goal of learning from few training examples, as biological systems do, i.e., learning in a data-efficient manner (Hassabis et al., 2017)

  • Since unnecessary input dimensions lead to further overparameterization, it is unclear to what extent DNNs suffer from unnecessary input dimensions and whether more data is needed to learn to discard them

  • Increasing the number of task-unrelated dimensions leads to a substantial drop in data efficiency, while increasing the number of task-related dimensions that are linear combinations of other task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions (see the sketch after this list). These results suggest that mechanisms to discard unnecessary input dimensions, such as foveation for object recognition, are necessary to enable data efficiency gains

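As an illustration of the manipulation described in the highlights, the following is a minimal sketch (not the authors' code) of how task-unrelated noise dimensions and redundant task-related dimensions could be appended to a dataset before training; the function add_dimensions and its parameters are hypothetical.

```python
# Minimal sketch (illustrative, not the paper's code): pad each example with
# extra input dimensions of two kinds.
import numpy as np

rng = np.random.default_rng(0)

def add_dimensions(X, n_unrelated=0, n_redundant=0, noise_std=1.0):
    """Append extra input dimensions to the examples in X.

    n_unrelated: dimensions drawn from noise, independent of the label
                 (task-unrelated).
    n_redundant: dimensions that are random linear combinations of the
                 original features (task-related but redundant).
    """
    parts = [X]
    if n_unrelated > 0:
        # Task-unrelated dimensions: pure noise, carrying no label information.
        parts.append(noise_std * rng.standard_normal((X.shape[0], n_unrelated)))
    if n_redundant > 0:
        # Redundant task-related dimensions: linear combinations of X's columns.
        W = rng.standard_normal((X.shape[1], n_redundant))
        parts.append(X @ W)
    return np.concatenate(parts, axis=1)

# Example: a two-class linearly separable toy dataset, padded with 50
# task-unrelated and 10 redundant dimensions.
X = rng.standard_normal((1000, 20))
y = (X @ rng.standard_normal(20) > 0).astype(int)
X_padded = add_dimensions(X, n_unrelated=50, n_redundant=10)
print(X_padded.shape)  # (1000, 80)
```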

Summary

INTRODUCTION

The success of Deep Neural Networks (DNNs) contrasts with the still distant goal of learning from few training examples, as biological systems do, i.e., learning in a data-efficient manner (Hassabis et al., 2017). We introduce the hypothesis that an important aspect of data efficiency is that biological systems rely on mechanisms such as foveation to reduce unnecessary input dimensions, e.g., the background in object recognition, while state-of-the-art DNNs do not. Increasing the number of task-unrelated dimensions leads to a substantial drop in data efficiency, while increasing the number of task-related dimensions that are linear combinations of other task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions. These results suggest that mechanisms to discard unnecessary input dimensions, such as foveation for object recognition, are necessary to enable data efficiency gains.
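
To make the notion of data efficiency concrete, here is a minimal sketch under the assumption that data efficiency is measured as the smallest training-set size at which a fixed network reaches a target test accuracy; the helper examples_needed and the use of scikit-learn's MLPClassifier as a stand-in for a DNN are illustrative assumptions, not the paper's protocol.

```python
# Minimal sketch (illustrative assumptions): data efficiency as the smallest
# training-set size at which a fixed network reaches a target test accuracy.
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def examples_needed(X_train, y_train, X_test, y_test,
                    target_acc=0.90, sizes=(50, 100, 200, 400, 800, 1600)):
    """Return the smallest training-set size reaching target_acc, or None."""
    for n in sizes:
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
        clf.fit(X_train[:n], y_train[:n])
        if accuracy_score(y_test, clf.predict(X_test)) >= target_acc:
            return n
    return None
```

Comparing the value returned for the original inputs with the value returned for inputs padded with task-unrelated dimensions (e.g., via the add_dimensions sketch above) would quantify the drop in data efficiency described here.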

Object’s Background and DNN Generalization
Overparameterization and Data Dimensionality
UNNECESSARY INPUT DIMENSIONS AND DATA EFFICIENCY
Linearly Separable Dataset
Non-linearly Separable Dataset With Different Noise Distributions
Object Recognition Datasets
CONCLUSIONS
DATA AVAILABILITY STATEMENT