Abstract

Biological learning systems are outstanding in their ability to learn from limited training data compared to the most successful learning machines, i.e., Deep Neural Networks (DNNs). Which key aspects underlie this data efficiency gap is an unresolved question at the core of biological and artificial intelligence. We hypothesize that one important aspect is that biological systems rely on mechanisms such as foveation to reduce unnecessary input dimensions for the task at hand, e.g., the background in object recognition, while state-of-the-art DNNs do not. Datasets used to train DNNs often contain such unnecessary input dimensions, and these lead to more trainable parameters. Yet, it is not clear whether this affects the DNNs' data efficiency: DNNs are known to be robust to an increasing number of parameters in the hidden layers, but it is uncertain whether this robustness extends to parameters added through the input layer. In this paper, we investigate the impact of unnecessary input dimensions on the DNNs' data efficiency, namely, the number of examples needed to achieve a given generalization performance. Our results show that unnecessary input dimensions that are task-unrelated substantially degrade data efficiency. This highlights the need for mechanisms that remove task-unrelated dimensions, such as foveation for image classification, in order to enable data efficiency gains.

Highlights

  • The success of Deep Neural Networks (DNNs) contrasts with the still distant goal of learning from few training examples, as biological systems do, i.e., learning in a data-efficient manner (Hassabis et al., 2017)

  • Since unnecessary input dimensions lead to further overparameterization, it is unclear to what extent DNNs suffer from unnecessary input dimensions and whether more data is needed to learn to discard them

  • Increasing the number of task-unrelated dimensions leads to a substantial drop in data efficiency, while increasing the number of task-related dimensions that are linear combinations of other task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions (see the sketch after this list). These results suggest that mechanisms to discard unnecessary input dimensions, such as foveation for object recognition, are necessary to enable data efficiency gains

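As an illustration of the manipulation described in the highlights, the following is a minimal sketch (not the authors' code) of how task-unrelated noise dimensions and redundant task-related dimensions could be appended to a dataset before training; the function add_dimensions and its parameters are hypothetical.

```python
# Minimal sketch (illustrative, not the paper's code): pad each example with
# extra input dimensions of two kinds.
import numpy as np

rng = np.random.default_rng(0)

def add_dimensions(X, n_unrelated=0, n_redundant=0, noise_std=1.0):
    """Append extra input dimensions to the examples in X.

    n_unrelated: dimensions drawn from noise, independent of the label
                 (task-unrelated).
    n_redundant: dimensions that are random linear combinations of the
                 original features (task-related but redundant).
    """
    parts = [X]
    if n_unrelated > 0:
        # Task-unrelated dimensions: pure noise, carrying no label information.
        parts.append(noise_std * rng.standard_normal((X.shape[0], n_unrelated)))
    if n_redundant > 0:
        # Redundant task-related dimensions: linear combinations of X's columns.
        W = rng.standard_normal((X.shape[1], n_redundant))
        parts.append(X @ W)
    return np.concatenate(parts, axis=1)

# Example: a two-class linearly separable toy dataset, padded with 50
# task-unrelated and 10 redundant dimensions.
X = rng.standard_normal((1000, 20))
y = (X @ rng.standard_normal(20) > 0).astype(int)
X_padded = add_dimensions(X, n_unrelated=50, n_redundant=10)
print(X_padded.shape)  # (1000, 80)
```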

Summary

INTRODUCTION

The success of Deep Neural Networks (DNNs) contrasts with the still distant goal of learning from few training examples, as biological systems do, i.e., learning in a data-efficient manner (Hassabis et al., 2017). We introduce the hypothesis that an important aspect of data efficiency is that biological systems rely on mechanisms such as foveation to reduce unnecessary input dimensions, e.g., the background in object recognition, while state-of-the-art DNNs do not. Increasing the number of task-unrelated dimensions leads to a substantial drop in data efficiency, while increasing the number of task-related dimensions that are linear combinations of other task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions. These results suggest that mechanisms to discard unnecessary input dimensions, such as foveation for object recognition, are necessary to enable data efficiency gains.
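
To make the notion of data efficiency concrete, here is a minimal sketch under the assumption that data efficiency is measured as the smallest training-set size at which a fixed network reaches a target test accuracy; the helper examples_needed and the use of scikit-learn's MLPClassifier as a stand-in for a DNN are illustrative assumptions, not the paper's protocol.

```python
# Minimal sketch (illustrative assumptions): data efficiency as the smallest
# training-set size at which a fixed network reaches a target test accuracy.
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def examples_needed(X_train, y_train, X_test, y_test,
                    target_acc=0.90, sizes=(50, 100, 200, 400, 800, 1600)):
    """Return the smallest training-set size reaching target_acc, or None."""
    for n in sizes:
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
        clf.fit(X_train[:n], y_train[:n])
        if accuracy_score(y_test, clf.predict(X_test)) >= target_acc:
            return n
    return None
```

Comparing the value returned for the original inputs with the value returned for inputs padded with task-unrelated dimensions (e.g., via the add_dimensions sketch above) would quantify the drop in data efficiency described here.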

Object’s Background and DNN Generalization
Overparameterization and Data Dimensionality
UNNECESSARY INPUT DIMENSIONS AND DATA EFFICIENCY
Linearly Separable Dataset
Non-linearly Separable Dataset With Different Noise Distributions
Object Recognition Datasets
CONCLUSIONS
DATA AVAILABILITY STATEMENT