Abstract
Various few-shot image classification methods indicate that transferring knowledge from other sources can improve classification accuracy. However, most of these methods work with a single source or use only closely correlated knowledge sources. In this paper, we propose a novel weakly correlated knowledge integration (WCKI) framework to address these issues. More specifically, we propose a unified knowledge graph (UKG) to integrate knowledge transferred from different sources (i.e., the visual domain and the textual domain). Moreover, a graph attention module is proposed to sample a subgraph from the UKG with low complexity. To avoid explicitly aligning the visual features to the potentially biased and weakly correlated knowledge space, we sample a task-specific subgraph from the UKG and append it as latent variables. Our framework demonstrates significant improvements on multiple few-shot image classification datasets.
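The idea of attention-based subgraph sampling can be illustrated with a minimal sketch. This is not the paper's graph attention module, whose details are not given here. It assumes a plain scaled dot-product attention between a task-level query vector and the node embeddings, followed by top-k selection. The names `sample_subgraph`, `node_emb`, `adj`, `query`, and `k` are all hypothetical.

```python
import math

def sample_subgraph(node_emb, adj, query, k):
    """Select the k nodes most relevant to `query`; return the induced subgraph.

    node_emb: list of N embedding vectors (hypothetical stand-in for UKG nodes).
    adj:      N x N adjacency matrix as nested lists.
    query:    task-level feature vector (assumption: e.g. a mean support feature).
    """
    d = len(query)
    # Scaled dot-product score between the query and every node.
    scores = [sum(x * q for x, q in zip(emb, query)) / math.sqrt(d)
              for emb in node_emb]
    # Numerically stable softmax over all nodes.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Keep the k highest-weighted nodes, ordered by decreasing weight.
    top = sorted(range(len(weights)), key=lambda i: -weights[i])[:k]
    # Induced subgraph: only rows and columns of the selected nodes.
    sub_adj = [[adj[i][j] for j in top] for i in top]
    return top, [weights[i] for i in top], sub_adj

# Toy graph: four nodes on a ring, with one-hot embeddings.
node_emb = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
adj = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
query = [0.0, 3.0, 1.0, 0.0]

top, top_weights, sub_adj = sample_subgraph(node_emb, adj, query, k=2)
print(top, sub_adj)
```

In this toy setting the query aligns most with nodes 1 and 2, so only those two nodes and the edge between them are kept. The downstream model would then operate on a graph of size k rather than on the full graph.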
Highlights
Deep learning approaches have achieved impressive performance on image classification tasks recently
We propose a weakly correlated knowledge integration (WCKI) framework which can leverage nonstructural and weakly correlated knowledge extracted from different sources to improve the few-shot classification performance
The extra cost comes from two parts: the increase in the size of the graph Gpre caused by the auxiliary latent subgraph, and the newly introduced graph attention module
Summary
Deep learning approaches have achieved impressive performance on image classification tasks recently. Some works (e.g., CADA-VAE[6], Soravit's method[7], and ReViSE[8]) align the features from the visual feature domain to the textual feature domain. Many of these methods are intended to work on datasets (e.g., animal with annotation[11] and CUB[12]) that provide highly correlated and structural textual descriptions. Few such methods apply to datasets that only provide weakly correlated descriptions, e.g., the Mini-ImageNet and Tiered-ImageNet datasets. In these datasets, the label descriptions are not strongly correlated with the visual properties of the corresponding classes. LSFS still requires the dataset to provide an extra hierarchical annotation of different classes, while MNE does not utilize information in the label description. An example label description: "American robin, Turdus migratorius (large American thrush having a rust-red breast and abdomen)".