Abstract

Classification is one of the most common tasks in machine learning. This supervised learning problem consists of assigning each input to one of a finite number of discrete categories. Classification arises naturally in numerous applications, such as medical image processing, speech recognition, maintenance systems, accident detection, and autonomous driving.

In the last decade, deep learning methods have proven extremely effective in many machine learning problems, including classification. While the neural network architecture may depend heavily on the data type and on restrictions imposed by the nature of the problem (for example, real-time applications), training the network (i.e., finding the model's parameters) is almost always posed as a loss function optimization problem.

Cross-entropy is a loss function often used for multiclass classification problems, as it allows high accuracy to be achieved.

Here we propose to use a generalized version of this loss based on Rényi divergence and entropy. We remark that for binary labels the proposed generalization reduces to cross-entropy, so we work in the context of soft labels. Specifically, we consider an image classification problem solved by convolutional neural networks with the mixup regularizer. The latter expands the training set by taking convex combinations of pairs of data samples and their corresponding labels. Consequently, labels are no longer binary (corresponding to a single class) but take the form of a vector of probabilities. In this setting, cross-entropy and the proposed generalization based on Rényi divergence and entropy are distinct, and comparing them is meaningful.

To measure the effectiveness of the proposed loss function, we consider an image classification problem on the benchmark CIFAR-10 dataset. This dataset consists of 60000 color images of size 32×32 belonging to 10 classes. The training set consists of 50000 images, and the test set contains 10000 images.

For the convolutional neural network, we follow [1], where the same classification task was studied with respect to different loss functions, and we use the same neural network architecture in order to obtain comparable results.

Experiments demonstrate the superiority of the proposed method over cross-entropy for loss function parameter values α < 1. For α > 1, the proposed method performs worse than the cross-entropy loss. Finally, α = 1 corresponds to cross-entropy.
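
The mixup step and the comparison between the two losses can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the abstract does not state the loss formula, so the sketch assumes the loss is the Rényi entropy of the soft label plus the Rényi divergence from the soft label to the prediction, a form that reduces to cross-entropy for one-hot labels and for α = 1. All function names are hypothetical, predictions are assumed to be strictly positive (e.g. softmax outputs), and the mixing weight `lam` is passed in directly rather than sampled.

```python
import math


def mixup(x1, y1, x2, y2, lam):
    """Convex combination of two samples and of their label vectors."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def cross_entropy(p, q):
    """Cross-entropy -sum_i p_i log q_i between label p and prediction q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_entropy(p, alpha):
    """Renyi entropy log(sum_i p_i^alpha) / (1 - alpha), for alpha > 0."""
    if abs(alpha - 1.0) < 1e-12:
        # alpha -> 1 limit: Shannon entropy
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def renyi_divergence(p, q, alpha):
    """Renyi divergence log(sum_i p_i^alpha q_i^(1-alpha)) / (alpha - 1)."""
    if abs(alpha - 1.0) < 1e-12:
        # alpha -> 1 limit: Kullback-Leibler divergence
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    s = sum(pi ** alpha * qi ** (1.0 - alpha)
            for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)


def renyi_loss(p, q, alpha):
    """ASSUMED loss form: Renyi entropy of p plus Renyi divergence p || q."""
    return renyi_entropy(p, alpha) + renyi_divergence(p, q, alpha)


if __name__ == "__main__":
    q = [0.2, 0.5, 0.3]                       # hypothetical softmax output
    _, soft = mixup([1.0, 0.0], [1.0, 0.0, 0.0],
                    [0.0, 1.0], [0.0, 1.0, 0.0], lam=0.7)
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, renyi_loss(soft, q, alpha), cross_entropy(soft, q))
```

Under this assumed form, a one-hot label gives the same value as cross-entropy for any α, while a mixed label such as the one built above gives a different value whenever α ≠ 1, which is why the comparison is only informative with soft labels.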

Highlights

  • In recent years, deep learning methods have shown steady success in various application areas, such as computer vision, natural language processing, and autonomous driving.

  • In our work, we propose the use of Rényi entropy and divergence as a loss function for classification problems.

  • We follow the experiment outline given in [1], where a number of different loss functions were compared in the context of an image classification problem.

Summary

Introduction

Deep learning methods have shown steady success in various application areas, such as computer vision, natural language processing, and autonomous driving. This success can be attributed to the high performance of deep learning models in comparison to other methods, notably for unstructured data such as images or text. In our work, we propose the use of Rényi entropy and divergence as a loss function for classification problems. We follow the experiment outline given in [1], where a number of different loss functions were compared in the context of an image classification problem. The following section describes the proposed method.
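
For orientation, the standard definitions behind such a loss can be written out. The combination L_α below (Rényi entropy plus Rényi divergence, mirroring the decomposition of cross-entropy into Shannon entropy plus Kullback-Leibler divergence) is an assumed form, not a formula quoted from the paper; it is chosen because it reduces to cross-entropy both for one-hot labels and at α = 1. The symbols p (label vector), q (predicted probabilities), K (number of classes), and L_α are notation introduced here for illustration.

```latex
% Standard definitions, for \alpha > 0, \alpha \neq 1
H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_{i=1}^{K} p_i^{\alpha},
\qquad
D_\alpha(p \,\|\, q) = \frac{1}{\alpha-1} \log \sum_{i=1}^{K} p_i^{\alpha} q_i^{1-\alpha}

% Assumed form of the generalized loss
L_\alpha(p, q) = H_\alpha(p) + D_\alpha(p \,\|\, q)

% One-hot label p = e_k: the entropy term vanishes and
L_\alpha(e_k, q) = \frac{1}{\alpha-1} \log q_k^{1-\alpha} = -\log q_k

% Limit \alpha \to 1: Shannon entropy plus KL divergence, i.e. cross-entropy
\lim_{\alpha \to 1} L_\alpha(p, q)
  = H(p) + D_{\mathrm{KL}}(p \,\|\, q)
  = -\sum_{i=1}^{K} p_i \log q_i
```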

Related work
Methodology
Experiments
Conclusions