Abstract
Cosine-based softmax loss functions greatly enhance intra-class compactness and perform well on face recognition and object classification tasks. Their superior performance, however, depends on careful hyperparameter selection. Adaptively Scaling Cosine Logits (AdaCos) offers a hyperparameter-free variant by leveraging an adaptive scaling parameter; nevertheless, its application is limited in certain domains because of an improper approximation. In this paper, to promote intra-class compactness and inter-class separability, we propose an Angular Gradient Margin Loss (ArcGrad) that generates a gradient margin by maximizing the angular gradient. Our work suggests that the margin parameter in cosine-based methods is unnecessary and that the scaling parameter is inversely proportional to the margin. Furthermore, a large and stable gradient promotes better feature representations. In experiments, we evaluate our method, along with other methods that enhance discriminative information, on CIFAR and 15 UCI datasets. Experimental results show that ArcGrad consistently outperforms competing methods on both large- and small-scale problems and is superior in discriminative power and computation time.
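For context, the sketch below illustrates the general family of cosine-based softmax losses the abstract refers to: features and class weights are L2-normalized so logits become cosines of angles, then scaled by s, with an optional additive angular margin m on the target class (ArcFace-style). This is a minimal, assumed illustration of that family, not the paper's ArcGrad; the function name and the values s=30.0 and m=0.5 are hypothetical defaults chosen for demonstration.

```python
import torch
import torch.nn.functional as F

def cosine_margin_logits(features, weights, labels, s=30.0, m=0.5):
    """Generic cosine-softmax logits with scale s and additive angular
    margin m (illustrative sketch, not the proposed ArcGrad loss)."""
    # Normalize features and class weights so each logit is cos(theta_j).
    f = F.normalize(features, dim=1)            # (N, D)
    w = F.normalize(weights, dim=1)             # (C, D)
    cos = f @ w.t()                             # (N, C) cosine similarities
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    # Apply the angular margin only to the ground-truth class angle.
    target = F.one_hot(labels, num_classes=w.size(0)).bool()
    cos_with_margin = torch.where(target, torch.cos(theta + m), cos)
    return s * cos_with_margin                  # scaled logits for softmax

# usage: loss = F.cross_entropy(cosine_margin_logits(feat, W, y), y)
```

Setting m = 0 and making s adaptive recovers an AdaCos-style parameter-free formulation, which is the baseline the abstract contrasts ArcGrad against.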