Abstract

Weakly supervised object localization locates objects using a localization map generated by a classification network. However, most existing methods rely only on target-class information and on the feature map of a single image, which ignores both inter-class and intra-class relationships. In this work, we propose a Gradient-based Refined Class Activation Map (GRCAM) approach to achieve more accurate localization. Two kinds of gradients are applied during the testing stage to reveal inter-class and intra-class relationships. First, we exploit the gradients of the classification loss function with respect to the feature map to enhance class-specific information. These gradients reveal the connection among the predicted probabilities of all classes. Second, we design a regression function, defined as the loss between pseudo-bounding-box coordinates that encode category consistency and the predicted coordinates generated from the localization map. The predicted coordinates are revised using the gradients of this regression function, which reveal the consistency within a class. Despite its apparent simplicity, GRCAM shows clear advantages in extensive experiments on ILSVRC and CUB-200-2011. In particular, on the ILSVRC dataset, the proposed GRCAM achieves a new state-of-the-art Top-1 localization error of 42.94%.
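
To illustrate why the gradient of a classification loss with respect to the feature map depends on the predicted probabilities of all classes, the following minimal sketch may help. It is not the GRCAM formulation, which the abstract does not specify. It assumes a CAM-style head (global average pooling followed by a linear classifier with softmax cross-entropy), and the function name `loss_gradient_map` and all shapes are hypothetical choices made for this example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def loss_gradient_map(features, weights, target):
    """Illustrative only: weight feature channels by the loss gradient.

    features: (C, H, W) feature map of one image
    weights:  (K, C) linear classifier applied after global average pooling
    target:   index of the class to localize
    """
    C, H, W = features.shape
    pooled = features.mean(axis=(1, 2))          # global average pooling, (C,)
    probs = softmax(weights @ pooled)            # class probabilities, (K,)

    onehot = np.zeros_like(probs)
    onehot[target] = 1.0

    # For softmax cross-entropy, dL/dlogits = probs - onehot, so every
    # class probability enters the gradient, not only the target class.
    # The chain rule through the linear layer and the average pooling gives
    # one value per channel, identical at every spatial position.
    grad_channels = weights.T @ (probs - onehot) / (H * W)   # (C,)

    # Moving against the loss gradient raises the target probability, so
    # the negated gradient is used as channel weights (an assumption here).
    loc_map = np.maximum(np.tensordot(-grad_channels, features, axes=1), 0.0)
    return probs, grad_channels, loc_map

rng = np.random.default_rng(0)
feats = rng.random((4, 3, 3))
clf = rng.standard_normal((5, 4))
probs, grads, loc_map = loss_gradient_map(feats, clf, target=2)
print(loc_map.shape)
```

In this sketch the term `probs - onehot` is where the inter-class connection appears: channels that support competing classes receive different weights than they would from the target-class classifier weights alone.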
