Abstract
Employing additional prior knowledge to model local features as the final fine-grained object representation has become a trend in fine-grained object retrieval (FGOR). A potential limitation of these methods is that, by introducing additional prior knowledge, they focus only on parts common across the dataset (e.g., head, body, or even legs), whereas retrieving a fine-grained object may rely on category-specific nuances that contribute to category prediction. To address this limitation, we propose an end-to-end Category-specific Nuance Exploration Network (CNENet) that discovers category-specific nuances that contribute to category prediction, and semantically aligns these nuances, grouped by subcategory, without any additional prior knowledge, thereby directly emphasizing the discrepancy among subcategories. Specifically, we design a Nuance Modelling Module that adaptively predicts a group of category-specific response (CARE) maps by implicitly digging into category-specific nuances, specifying the locations and scales of these nuances. Building on this, two nuance regularizations are proposed: 1) a semantic discrete loss that forces each CARE map to attend to a different spatial region, capturing diverse nuances; 2) a semantic alignment loss that constructs a consistent semantic correspondence among CARE maps of the same order within the same subcategory by guaranteeing that each instance and its transformed counterpart are spatially aligned. Moreover, we propose a Nuance Expansion Module, which exploits the contextual appearance information of discovered nuances and refines the prediction of each nuance using its similar neighbors, further improving nuance consistency and completeness. Extensive experiments validate that our CNENet consistently yields the best performance under the same settings against the most competitive approaches on the CUB Birds, Stanford Cars, and FGVC Aircraft datasets.
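To make the two regularizations concrete, below is a minimal PyTorch sketch of how such losses could be instantiated. Everything in it is an assumption for illustration: the abstract does not give the exact formulations, and the tensor shapes, function names, the pairwise-overlap construction, and the warp-back construction are hypothetical stand-ins for the paper's semantic discrete loss and semantic alignment loss.

```python
import torch
import torch.nn.functional as F

# Assumed shape: care_maps is (B, K, H, W), i.e. K CARE maps per image.
# All names and formulations below are illustrative, not the paper's actual losses.

def semantic_discrete_loss(care_maps: torch.Tensor) -> torch.Tensor:
    """Encourage the K CARE maps of each image to attend to different spatial
    regions by penalizing overlap between their spatial distributions
    (a plausible stand-in for the paper's semantic discrete loss)."""
    b, k, h, w = care_maps.shape
    # Normalize each map into a spatial probability distribution.
    p = F.softmax(care_maps.reshape(b, k, h * w), dim=-1)      # (B, K, HW)
    # Pairwise overlap between all maps of one image.
    overlap = torch.bmm(p, p.transpose(1, 2))                  # (B, K, K)
    # Keep only off-diagonal entries: a map may overlap with itself.
    diag = torch.diag_embed(torch.diagonal(overlap, dim1=1, dim2=2))
    off_diag = overlap - diag
    # Average overlap over all ordered pairs of distinct maps.
    return off_diag.sum(dim=(1, 2)).mean() / (k * (k - 1))

def semantic_alignment_loss(care_maps: torch.Tensor,
                            care_maps_aug: torch.Tensor,
                            inverse_warp) -> torch.Tensor:
    """Keep the i-th CARE map consistent between an image and its spatially
    transformed counterpart: warp the augmented maps back into the original
    frame and penalize disagreement (hypothetical `inverse_warp` undoes the
    known spatial transform applied to the input)."""
    aligned = inverse_warp(care_maps_aug)   # (B, K, H, W), original frame
    return F.mse_loss(care_maps, aligned)
```

In this sketch, the discrete term penalizes the average pairwise overlap between the spatial distributions of different CARE maps, while the alignment term only requires that each same-order map survives a known, invertible spatial transform of the input; both choices are design assumptions, not details stated in the abstract.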