Abstract

Weakly supervised object localization (WSOL) aims at predicting the location of objects with only image-level labels. The fine-grained WSOL task poses its own characteristic challenge compared with generic object localization: the structural information of objects in fine-grained benchmarks has little relevance to the class. Previous WSOL works mainly focus on recursively learning the most class-discriminative parts, which leads to a serious loss of structural features. In this paper, we propose a self-guided network (SGN) that consists of two deep classification network branches. It adopts a coarse-to-fine strategy to detect the structural information of the object. First, we devise a self-adaptive method (SAM) to detect most of the body structure of the object by directly leveraging the feature recognition ability of the first classifier. Then, an object structure generation (OSG) method is proposed for the fine localization phase. OSG helps the second classifier learn the boundary features of the object with less background noise. Extensive experiments on four well-known fine-grained benchmarks (CUB, FGVC Aircraft, Stanford Dogs, and Stanford Cars) show that the proposed SGN outperforms state-of-the-art WSOL methods.
