A model is worth tens of thousands of examples for estimation and thousands for classification

Thomas Dagès, Laurent D Cohen, Alfred M Bruckstein

doi:10.1016/j.patcog.2024.110904

Journal: Pattern Recognition

Publication Date: Aug 20, 2024

Abstract

Traditional signal processing methods relying on mathematical data generation models have been cast aside in favour of deep neural networks, which require vast amounts of data. Since the theoretical sample complexity is nearly impossible to evaluate, the required number of examples is usually estimated with crude rules of thumb. However, these rules only suggest when the networks should work, and do not relate them to the traditional methods. In particular, an interesting question is: how much data do neural networks require to match or, if possible, outperform the traditional model-based methods? In this work, we empirically investigate this question in three simple examples covering estimation and classification, where the data is generated according to precisely defined mathematical models, and where well-understood optimal or state-of-the-art mathematical data-agnostic solutions are known. The first problem is deconvolving one-dimensional Gaussian signals, the second is estimating a circle’s radius and location in random grayscale images of disks, and the third is both classifying the presence of a line and locating it, when present, in a binary random dot image. By training various networks, either naive custom-designed ones or well-established ones, with various amounts of training data, we find that, to be comparable to the traditional methods, networks require tens of thousands of examples for estimation and thousands for classification, whether they are trained from scratch or with transfer learning or fine-tuning.
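To make the second problem concrete, the following is a minimal sketch of what a data generation model and a model-based estimator can look like for disk images. It is an illustration under stated assumptions and not the authors' setup: the function names `make_disk_image` and `estimate_disk`, the image size, the foreground and background intensities, the Gaussian noise model, and the threshold-plus-moments estimator are all hypothetical choices made here.

```python
import numpy as np

def make_disk_image(size, cx, cy, r, fg=1.0, bg=0.0, noise_std=0.0, rng=None):
    """Render a grayscale disk of radius r centred at (cx, cy).

    Assumed model (not from the paper): constant foreground and background
    intensities plus optional additive Gaussian noise.
    """
    yy, xx = np.mgrid[0:size, 0:size]
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
    img = np.where(inside, fg, bg).astype(float)
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        img += rng.normal(0.0, noise_std, img.shape)
    return img

def estimate_disk(img, fg=1.0, bg=0.0):
    """Threshold halfway between bg and fg, then use simple moments.

    The centre is the centroid of the foreground pixels, and the radius
    follows from area = pi * r^2. Returns (cx, cy, r), or None if no
    pixel exceeds the threshold.
    """
    mask = img > 0.5 * (fg + bg)
    area = mask.sum()
    if area == 0:
        return None
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean(), np.sqrt(area / np.pi)

# Noise-free example: the estimate should land close to the true parameters.
image = make_disk_image(size=64, cx=30, cy=34, r=12)
cx_hat, cy_hat, r_hat = estimate_disk(image)
```

An estimator of this kind needs no training examples at all, which is the sense in which a known generation model can stand in for data; a network would instead have to learn the same mapping from many labelled images.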

Full Text