Few-shot classification aims to learn a classifier that categorizes objects of unseen classes from limited samples. One general approach is to mine as much information as possible from these limited samples, which can be achieved by incorporating data from multiple modalities. However, existing multi-modality methods use the additional modality only for support samples while relying on a single modality for query samples. Such an approach leads to an information imbalance between support and query samples, which confounds model generalization from support to query samples. To address this problem, we propose a task-adaptive semantic feature learning mechanism that incorporates semantic features for both support and query samples. The semantic feature learner is trained episodically by regressing from the feature vectors of the support samples and is then used to predict semantic features for the query samples. This maintains a consistent training scheme between support and query samples and enables direct model transfer from support to query data, which significantly improves generalization. We conduct extensive experiments on four benchmarks in both inductive and transductive settings. Results show that the proposed TasNet outperforms state-of-the-art methods by 1% to 5% in classification accuracy, demonstrating the superiority of our method. Exhaustive ablation studies further validate the effectiveness of our framework. The code is available at: https://github.com/pmhDL/TasNet
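To illustrate the idea of predicting semantic features for query samples from a regressor fitted on the support set, here is a minimal sketch. It is not the authors' implementation: the closed-form ridge regressor, the feature dimensions, and the final concatenation-based fusion are all assumptions made for illustration only.

```python
# Minimal sketch (assumed, not the authors' code): per-episode regression from
# visual features to class semantics, then applied to query samples.
import numpy as np

def fit_semantic_regressor(support_feats, support_semantics, lam=1.0):
    """Fit a ridge regressor mapping support visual features to semantic vectors."""
    X, Y = support_feats, support_semantics                       # (N, d_v), (N, d_s)
    d_v = X.shape[1]
    # Closed-form ridge solution: (X^T X + lam I)^-1 X^T Y
    return np.linalg.solve(X.T @ X + lam * np.eye(d_v), X.T @ Y)  # (d_v, d_s)

def predict_semantics(query_feats, W):
    """Predict semantic features for query samples with the fitted regressor."""
    return query_feats @ W                                        # (M, d_s)

# Toy 5-way episode with assumed dimensions (640-d visual, 300-d semantic).
rng = np.random.default_rng(0)
support_feats = rng.standard_normal((5, 640))      # backbone features of support set
support_semantics = rng.standard_normal((5, 300))  # e.g., class-name word embeddings
query_feats = rng.standard_normal((75, 640))       # backbone features of query set

W = fit_semantic_regressor(support_feats, support_semantics)
query_semantics = predict_semantics(query_feats, W)
# Assumed fusion: concatenate visual and predicted semantic features for classification.
query_joint = np.concatenate([query_feats, query_semantics], axis=1)
print(query_joint.shape)  # (75, 940)
```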