Training a deep-learning text classification model usually requires a large amount of labeled data, yet labeling data are usually labor-intensive and time-consuming. Few-shot text classification focuses on predicting unknown samples using only a few labeled samples. Recently, metric-based meta-learning methods have achieved promising results in few-shot text classification. They use episodic training in labeled samples to enhance the model’s generalization ability. However, existing models only focus on learning from a few labeled samples but neglect to learn from a large number of unlabeled samples. In this paper, we exploit the knowledge learned by the model in unlabeled samples to improve the generalization performance of the meta-network. Specifically, we introduce a novel knowledge distillation method that expands and enriches the meta-learning representation with self-supervised information. Meanwhile, we design a graph aggregation method that efficiently interacts the query set information with the support set information in each task and outputs a more discriminative representation. We conducted experiments on three public few-shot text classification datasets. The experimental results show that our model performs better than the state-of-the-art models in 5-way 1-shot and 5-way 5-shot cases.
Read full abstract