Abstract

Researchers and educators have long wrestled with the question of how best to teach their clients, be they humans, non-human animals, or machines. Here, we examine the effect of a single variable, the difficulty of training, on the rate of learning. In many situations we find that there is a sweet spot in which training is neither too easy nor too hard, and where learning progresses most quickly. We derive conditions for this sweet spot for a broad class of learning algorithms in the context of binary classification tasks. For all of these stochastic gradient-descent based learning algorithms, we find that the optimal error rate for training is around 15.87% or, conversely, that the optimal training accuracy is about 85%. We demonstrate the efficacy of this ‘Eighty Five Percent Rule’ for artificial neural networks used in AI and for biologically plausible neural networks thought to describe animal learning.
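To make the origin of that number concrete, here is a minimal numerical sketch, assuming Gaussian decision noise as in the paper's setting: if the per-trial learning signal is taken to be proportional to x·φ(x), where x is the stimulus strength in units of the learner's noise and φ is the standard normal density, the signal peaks at x = 1, giving an error rate of Φ(−1) ≈ 15.87%. Variable names and the grid below are illustrative, not the paper's code.

```python
import numpy as np
from scipy.stats import norm

# Minimal sketch (assumed Gaussian decision noise): the per-trial learning signal
# is taken to be proportional to x * phi(x), where x = stimulus strength measured
# in units of the learner's noise and phi is the standard normal density.
x = np.linspace(0.01, 4.0, 4000)
learning_signal = x * norm.pdf(x)

x_star = x[np.argmax(learning_signal)]   # numerically ~ 1.0
optimal_error_rate = norm.cdf(-x_star)   # ~ 0.1587

print(f"optimal stimulus strength (in noise units): {x_star:.3f}")
print(f"optimal training error rate               : {optimal_error_rate:.4f}")
print(f"optimal training accuracy                 : {1 - optimal_error_rate:.4f}")
```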

Highlights

  • Researchers and educators have long wrestled with the question of how best to teach their clients, be they humans, non-human animals, or machines

  • In this paper we address the issue of optimal training difficulty for a broad class of learning algorithms in the context of binary classification tasks, in which ambiguous stimuli must be classified into one of two classes

  • In the classic random dot motion task, a major factor determining the difficulty of the perceptual decision is the fraction of coherently moving dots, which the experimenter can manipulate to hold the error rate fixed during training using a procedure known as ‘staircasing’[17] (a simple variant is sketched below)
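To show how staircasing can hold performance near a target level, the sketch below uses a weighted up/down rule, one common staircase variant and not necessarily the exact procedure of [17]: the stimulus is made slightly harder after each correct response and substantially easier after each error, so accuracy settles near the chosen target. The toy observer model and all parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

# Minimal sketch of a weighted up/down staircase, one common way to hold accuracy
# near a target level; the exact procedure in [17] may differ.  The observer model
# (P(correct) = Phi(coherence / sigma)) and all parameter values are assumptions.
rng = np.random.default_rng(0)
sigma = 0.3                      # observer's (hidden) perceptual noise
target_accuracy = 0.85
step_down = 0.01                                               # harder after a correct trial
step_up = step_down * target_accuracy / (1 - target_accuracy)  # easier after an error

coherence = 0.5                  # start with an easy stimulus
outcomes = []
for trial in range(2000):
    p_correct = norm.cdf(coherence / sigma)
    correct = rng.random() < p_correct
    coherence = float(np.clip(coherence + (-step_down if correct else step_up), 0.0, 1.0))
    outcomes.append(correct)

print(f"final coherence               : {coherence:.3f}")
print(f"accuracy over last 1000 trials: {np.mean(outcomes[-1000:]):.3f}")   # close to 0.85
```

At equilibrium the expected change in coherence is zero, which happens exactly when the probability of a correct response equals step_up / (step_up + step_down) = 0.85.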

Introduction

Researchers and educators have long wrestled with the question of how best to teach their clients, be they humans, non-human animals, or machines. Here we address the issue of optimal training difficulty for a broad class of learning algorithms in the context of binary classification tasks, in which ambiguous stimuli must be classified into one of two classes. These algorithms descend the gradient of error rate as a function of model parameters. Such gradient-descent learning forms the basis of many algorithms in AI, from single-layer perceptrons to deep neural networks[10], and provides a quantitative description of human and animal learning in a variety of situations, from perception[11], to motor control[12], to reinforcement learning[13]. For these algorithms, we provide a general result for the optimal difficulty in terms of a target error rate for training. We demonstrate the applicability of the Eighty Five Percent Rule to artificial one- and two-layer neural networks[9,14], and to a model from computational neuroscience that is thought to describe human and animal perceptual learning[11].
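As an illustration of this setup, the sketch below trains a single-layer logistic unit with stochastic gradient descent on a binary classification task, choosing each trial's stimulus strength so that the learner's current expected accuracy matches a target value. This is a toy reconstruction under assumed parameters, not the paper's implementation; in runs of this kind the intermediate target near 85% accuracy typically yields faster improvement than training that is much easier or much harder.

```python
import numpy as np
from scipy.stats import norm

# Hedged sketch: a single-layer logistic unit trained with stochastic gradient
# descent on a binary classification task.  Each trial's stimulus strength Delta
# is chosen so that the learner's CURRENT expected accuracy equals a target value,
# mimicking a staircased training regime.  Names and parameter values are illustrative.
rng = np.random.default_rng(1)

def train(target_accuracy, dim=50, n_trials=3000, lr=0.1, sigma=1.0, delta_max=10.0):
    w = np.zeros(dim)
    w[0] = 0.01                       # tiny initial alignment with the true direction e1
    z = norm.ppf(target_accuracy)     # desired margin, in units of the learner's noise
    for _ in range(n_trials):
        y = rng.choice([-1.0, 1.0])
        # Choose Delta so that P(correct | current w) ~= target_accuracy:
        # for stimuli along e1, P(correct) = Phi(Delta * w[0] / (sigma * ||w||)).
        delta = np.clip(z * sigma * np.linalg.norm(w) / max(w[0], 1e-6), 0.0, delta_max)
        x = np.zeros(dim)
        x[0] = y * delta
        x += sigma * rng.normal(size=dim)
        # SGD step on the logistic loss log(1 + exp(-y * w.x))
        margin = np.clip(y * np.dot(w, x), -60.0, 60.0)
        w += lr * y * x / (1.0 + np.exp(margin))
    # Report how well the learned weights align with the true discriminative direction.
    return w[0] / np.linalg.norm(w)

for target in (0.60, 0.85, 0.99):
    print(f"training accuracy target {target:.2f} -> "
          f"alignment with true direction {train(target):.3f}")
```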
