Abstract

Many application scenarios for image recognition require learning deep networks from small sample sizes on the order of a few hundred samples per class. In such settings, avoiding overfitting is critical. Common techniques to address overfitting are transfer learning, reduction of model complexity, and artificial enrichment of the available data, e.g., by data augmentation. A key idea proposed in this paper is to incorporate additional samples into the training that do not belong to the classes of the target task. This can be accomplished by formulating the original classification task as an open set classification task. While the original closed set classification task is not altered at inference time, recasting it as an open set classification task enables the inclusion of additional data during training. Hence, the original closed set classification task is augmented with an open set task during training; we therefore call the proposed approach open set task augmentation. To integrate additional task-unrelated samples into the training, we employ the entropic open set loss originally proposed for open set classification tasks and also show that similar results can be obtained with a modified sum of squared errors loss function. Learning with the proposed approach benefits from additional “unknown” samples, which are often available, e.g., from open data sets, and can easily be incorporated into the learning process. We show that open set task augmentation can improve model performance even when these additional samples are rather few or far from the domain of the target task. The proposed approach is demonstrated on two exemplary scenarios based on subsets of the ImageNet and Food-101 data sets, with several network architectures and two loss functions. We further shed light on the impact of the entropic open set loss on the internal representations formed by the networks. Open set task augmentation is particularly valuable when no additional data from the target classes are available, a scenario often faced in practice.
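For concreteness, the following is a minimal sketch, not the authors' implementation, of how the entropic open set loss can accommodate task-unrelated samples during training. It assumes PyTorch and the hypothetical convention that unknown samples carry the sentinel label -1; known samples receive ordinary cross-entropy, while unknown samples are pushed towards a uniform softmax over the known classes.

```python
import torch
import torch.nn.functional as F

UNKNOWN_LABEL = -1  # hypothetical sentinel for task-unrelated ("unknown") samples


def entropic_open_set_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy for known samples, uniform-softmax target for unknowns."""
    log_probs = F.log_softmax(logits, dim=1)          # log S_c(x) for every known class c
    known = labels != UNKNOWN_LABEL
    loss = torch.zeros(logits.size(0), device=logits.device)
    if known.any():
        # Known samples: maximize the output of the target-class neuron.
        loss[known] = F.nll_loss(log_probs[known], labels[known], reduction="none")
    if (~known).any():
        # Unknown samples: average negative log-probability over all known
        # classes, which drives the softmax towards the uniform distribution.
        loss[~known] = -log_probs[~known].mean(dim=1)
    return loss.mean()
```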

Highlights

  • Machine learning algorithms have achieved great success in image classification, image recognition, and image processing

  • In order to integrate additional task-unrelated samples into the training, we employ the entropic open set loss originally proposed for open set classification tasks and show that similar results can be obtained with a modified sum of squared errors loss function

  • While we do not propose to use the sum of squared errors Ek for classification in practice, we investigate in this paper whether the principal benefit of open set task augmentation can also be observed with an alternative loss function different from the entropic open set (EOS) loss (a sketch of such a loss is given below)
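The modified sum of squared errors loss mentioned in the last highlight can be illustrated as follows. This is a hedged sketch under assumptions, not the paper's exact definition of Ek: one-hot targets for samples from the known classes, an all-zero target vector for unknown samples, and the sentinel label -1 for unknowns as in the sketch above.

```python
import torch


def modified_sse_loss(outputs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Sum of squared errors with all-zero targets for unknown samples."""
    # outputs: (batch, C) network output activations, e.g. after a sigmoid
    # labels:  (batch,) class indices in [0, C), or -1 for unknown samples
    targets = torch.zeros_like(outputs)
    known = labels != -1
    targets[known, labels[known]] = 1.0   # one-hot targets for known classes
    # Unknown samples keep the all-zero target, so any deviation from a low,
    # constant activation level is penalized for every output neuron.
    return ((outputs - targets) ** 2).sum(dim=1).mean()
```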

Introduction

Machine learning algorithms have achieved great success in image classification, image recognition, and image processing. State-of-the-art algorithms reach human-level or even superhuman performance [18]. This success is based on enormous amounts of training data. In the OSTA condition, any deviation from a low, constant activation level is penalized for all output neurons for samples from the unknown class, while the output of the neuron corresponding to the target class is maximized only for samples from the known classes. This maximizes the difference between the output activations for samples from unknown classes (constantly low) and the output activation of the target neuron for samples from known classes (as high as possible). It favors more selective network responses to known samples and makes the neural representation more sparse, as discussed in an earlier section.
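As an illustration of how such "unknown" samples, e.g. from an open data set, could be folded into the training set, the following sketch relabels an arbitrary labelled data set as unknown and concatenates it with the target task's data. It assumes torchvision-style datasets returning (image, label) pairs and the sentinel label -1 used in the loss sketches above.

```python
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class AsUnknown(Dataset):
    """Wraps a labelled dataset and relabels every sample as unknown (-1)."""

    def __init__(self, base: Dataset):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        image, _ = self.base[idx]   # discard the original label
        return image, -1            # sentinel label: task-unrelated sample


# Hypothetical usage:
#   target_dataset: the small closed set task (a few hundred samples per class)
#   open_dataset:   additional, possibly out-of-domain images
# train_set = ConcatDataset([target_dataset, AsUnknown(open_dataset)])
# loader = DataLoader(train_set, batch_size=64, shuffle=True)
```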
