Abstract

When the training dataset follows a long-tail distribution, models tend to prioritize the majority classes, resulting in lower predictive accuracy on the minority classes. Among existing methods, integrating multiple experts with different logit distributions has yielded promising results. However, the current state-of-the-art (SOTA) ensemble method, i.e., Self-supervised Aggregation of Diverse Experts, trains three expert models separately to favor the head, middle, and tail data, respectively, without imposing mutual constraints. Failing to constrain the magnitude of the logits across experts can lead to higher category entropy, making it difficult to reach an optimal ensemble solution. To address this issue, we propose the Expert Constrained Multi-Expert Ensembles with Category Entropy Minimization method, which consists of two new strategies: (1) a Confidence Enhancement Loss that constrains each expert by maximizing the margin between target and non-target logits, thereby minimizing category entropy; (2) Shot-aware Weights attached to the expert models to accommodate each expert's preferred shot region (head, middle, or tail). Experiments demonstrate that our method effectively reduces expert category entropy, improves ensemble effectiveness, and achieves SOTA results on three datasets under diverse test distributions.
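To make the two ingredients concrete, the sketch below gives one plausible reading of them in PyTorch. It is not the authors' implementation: the function names (confidence_enhancement_loss, shot_aware_ensemble), the hinge-style margin formulation, the default margin value, and the per-class weight layout are all illustrative assumptions consistent with the abstract's description of maximizing target versus non-target logit margins and weighting head-, middle-, and tail-favoring experts.

```python
# Hedged sketch of the two strategies described in the abstract.
# All names and the exact formulations are assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def confidence_enhancement_loss(logits: torch.Tensor, targets: torch.Tensor,
                                margin: float = 1.0) -> torch.Tensor:
    """Encourage the target logit to exceed the largest non-target logit by
    `margin`; a larger gap sharpens the softmax and lowers category entropy."""
    target_logit = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    # Mask out the target class before taking the max over non-target logits.
    masked = logits.scatter(1, targets.unsqueeze(1), float("-inf"))
    max_nontarget = masked.max(dim=1).values
    return F.relu(margin - (target_logit - max_nontarget)).mean()


def shot_aware_ensemble(expert_logits: list[torch.Tensor],
                        shot_weights: torch.Tensor) -> torch.Tensor:
    """Combine per-expert logits with per-class (shot-aware) weights.

    expert_logits: list of E tensors, each of shape (batch, num_classes).
    shot_weights:  tensor of shape (E, num_classes), e.g. the head expert
                   weighted more on many-shot classes, the tail expert on
                   few-shot classes.
    """
    stacked = torch.stack(expert_logits, dim=0)        # (E, batch, C)
    weighted = shot_weights.unsqueeze(1) * stacked     # broadcast over batch
    return weighted.sum(dim=0)                         # (batch, C)
```

In this reading, the margin loss would be added to each expert's training objective, and the shot-aware weights would be applied only when aggregating the experts' logits at inference time; how the weights are chosen or learned is not specified in the abstract.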
