Abstract

Few-shot open-set recognition (FSOR) remains a relatively underexplored area of research. The primary challenge for FSOR methods lies in recognizing known classes while simultaneously rejecting unknown classes using only a few labeled samples. Current FSOR methods predominantly rely on visual information extracted from images to establish class representations, aiming to derive distinguishable classification scores for known and unknown classes. However, these methods often overlook the semantic information carried by the class names associated with the images, which could provide a valuable auxiliary learning signal. This study introduces a feature-semantic augmentation network that improves FSOR performance using multimodal information. Specifically, we augment the class-specific features of closed-set prototypes by integrating visual features with textual features from known class names across both local and global feature spaces. To facilitate prototype learning, we introduce a refinement module and a fusion module. The former leverages the similarity between prototype and target features along both channel and spatial dimensions to calibrate targets relative to their relevant prototypes. The latter generates additional classification targets that provide learning sources from different classes. Experimental results on various few-shot learning benchmarks show that the proposed method significantly outperforms current state-of-the-art methods in both closed- and open-set scenarios.
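To make the FSOR task setting concrete, the following is a minimal sketch of a generic prototype-based baseline, not the proposed feature-semantic augmentation network. It assumes that each class prototype is the mean of a few support feature vectors, that queries are scored by cosine similarity, and that a query is rejected as unknown when its best score falls below a fixed threshold. The function names, the toy three-dimensional features, and the threshold value of 0.8 are all illustrative assumptions.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_prototypes(support):
    """Map each known class label to the mean of its support features."""
    return {label: mean_vector(feats) for label, feats in support.items()}


def classify_open_set(query, prototypes, threshold=0.8):
    """Return (label, score); label is "unknown" when the best
    similarity is below the (assumed) rejection threshold."""
    scores = {label: cosine(query, p) for label, p in prototypes.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return "unknown", scores[best]
    return best, scores[best]


# Toy 2-way 2-shot episode with hand-made 3-D features (illustrative only).
support = {
    "cat": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "dog": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}
prototypes = build_prototypes(support)

known_label, _ = classify_open_set([0.95, 0.15, 0.0], prototypes)
unknown_label, _ = classify_open_set([0.0, 0.0, 1.0], prototypes)
```

In this toy episode the first query lies close to the "cat" prototype and is accepted, while the second is nearly orthogonal to both prototypes and is rejected. A purely visual baseline of this kind is the sort of approach the abstract contrasts with methods that also use class-name semantics.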
