Abstract

Imbalanced learning in the multi-label setting is a challenging problem, and it also arises when training deep neural networks. Previous studies have shown that resampling methods can reduce bias towards the majority group. Nonetheless, when extended to neural networks, these methods exhibit notable drawbacks, such as introducing extra hyperparameters and fixing the training mode. To eliminate these disadvantages, this paper proposes an efficient training technique named Mini-Batch Gradient Descent with Stratified sampling (MBGD-Ss), which alleviates the imbalanced-data problem through dynamic sampling. Given the particularities of the multi-label domain, we put forward two specific strategies: Label Powerset based (SsLP) and Label-based (SsL). SsLP treats each label combination (labelset) that appears in the dataset as a stratum, while SsL directly treats each label as a stratum. Extensive experiments validate the effectiveness of the proposed approach in decreasing the imbalance of the sampled data. Moreover, the empirical analysis shows that the proposed method can mitigate the classifier's bias across labels and, in particular, improve prediction accuracy on minority labels.
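The SsLP strategy described above can be illustrated with a minimal sketch. The function below is a hypothetical implementation written for this summary, not the authors' released code: it groups training indices by their labelset (one stratum per distinct label combination) and draws a mini-batch round-robin across strata, so minority labelsets are not crowded out by majority ones.

```python
import random
from collections import defaultdict

def stratified_minibatch(X, Y, batch_size, rng=None):
    """Draw one mini-batch by stratified sampling over labelsets (SsLP-style sketch).

    Each distinct label combination in Y forms a stratum; samples are
    drawn round-robin across strata so every labelset is represented.
    """
    rng = rng or random.Random(0)

    # Group sample indices by their label combination (one stratum per labelset).
    strata = defaultdict(list)
    for i, labels in enumerate(Y):
        strata[tuple(labels)].append(i)

    batch = []
    keys = list(strata)
    while len(batch) < batch_size:
        for k in keys:  # visit every stratum in each round
            batch.append(rng.choice(strata[k]))
            if len(batch) == batch_size:
                break
    return [X[i] for i in batch], [Y[i] for i in batch]
```

With a toy dataset of 8 samples labelled `(1, 0)` and 2 labelled `(0, 1)`, a batch of 4 contains two samples from each stratum, whereas uniform sampling would favour the majority labelset. The SsL variant would be analogous, but with one stratum per individual label rather than per label combination.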
