Abstract

Human-computer interaction on cloud computing platforms is very important, but the semantic gap limits interaction performance, so the semantic information in various scenarios must be understood. Relation classification (RC) is an important method for formalizing semantic descriptions: it classifies the relation between two specified entities in a sentence. Existing RC models typically rely on supervised learning or distant supervision. Supervised learning requires large-scale labeled training datasets, which are not readily available; distant supervision introduces noise, and many long-tail relations still suffer from data sparsity. Few-shot learning, which is widely used in image classification, is an effective method for overcoming data sparsity. In this paper, we apply few-shot learning to the relation classification task. However, not all instances contribute equally to the relation prototype in a text-based few-shot learning scenario, which causes the prototype deviation problem. To address this problem, we propose context attention-based prototypical networks: a context attention mechanism highlights the crucial instances in the support set so that a satisfactory prototype is generated. We also explore the application of a recently popular pre-trained language model to the few-shot relation classification task. The experimental results demonstrate that our model outperforms the state-of-the-art models and converges faster.
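
As a rough illustration of the idea (not the paper's exact formulation), the following PyTorch sketch contrasts the plain mean prototype of vanilla prototypical networks with an attention-weighted prototype; the function names and the dot-product relevance score are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def attention_prototype(support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """Build a relation prototype from K support embeddings (K, D),
    weighting each instance by its relevance to the query (D,).

    Hypothetical helper: the dot-product scoring below is an
    illustrative choice, not necessarily the paper's scoring function.
    """
    scores = support @ query                    # (K,) relevance to the query
    weights = F.softmax(scores, dim=0)          # (K,) attention weights
    # Weighted sum instead of the plain mean, so instances that poorly
    # represent the relation receive small weights.
    return (weights.unsqueeze(1) * support).sum(dim=0)   # (D,)

def mean_prototype(support: torch.Tensor) -> torch.Tensor:
    """Vanilla prototypical-network prototype: the unweighted mean."""
    return support.mean(dim=0)
```

Under this kind of weighting, noisy or atypical support instances contribute little to the prototype, which is the mechanism the paper proposes for mitigating prototype deviation.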

Highlights

  • In a cloud-computing scenario, human-computer interaction operations occur frequently [1,2,3,4]

  • Our solution is to score the instances in the support set via a context attention mechanism, highlighting the importance of each instance. Another objective of this paper is to explore the pre-trained language model bidirectional encoder representations from transformers (BERT) for the few-shot relation classification (RC) task

  • The proposed model is denoted Proto_CATT_BERT(CNN), indicating that it is composed of context attention-based prototypical networks and that BERT is used as the pre-trained language model in the embedding layer of the model (see the sketch after this list)
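
Since the highlights place BERT in the embedding layer, here is a minimal, hedged sketch of how support and query sentences could be encoded with a pre-trained BERT via the HuggingFace transformers library. The checkpoint name and the [CLS] pooling are assumptions; the paper may instead use a CNN encoder or richer entity-aware representations:

```python
import torch
from transformers import BertModel, BertTokenizer

# Assumption: "bert-base-uncased" as the pre-trained checkpoint.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

def embed(sentences):
    """Encode instance sentences with pre-trained BERT, using the
    [CLS] vector as the sentence embedding (an illustrative choice)."""
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0]          # (B, 768) [CLS] vectors
```

In the Proto_CATT_BERT(CNN) naming, BERT (or a CNN) serves as this embedding layer, and the prototypical-network components operate on the resulting vectors.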


Introduction

In a cloud-computing scenario, human-computer interaction operations occur frequently [1,2,3,4]. We find that not all instances in the support set are equally informative when prototypical networks are used for relation classification tasks. One of the main tasks of this paper is therefore to generate a satisfactory prototype from a text-based support set for the few-shot relation classification task.
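
For context, this is a minimal sketch of the standard prototypical-network decision rule in an N-way episode, assuming the negative squared Euclidean distance of the original prototypical networks; the paper's exact distance function is not stated in this excerpt:

```python
import torch

def classify_query(prototypes: torch.Tensor, query: torch.Tensor):
    """N-way episode decision rule: pick the relation whose prototype
    is nearest to the query under squared Euclidean distance.

    prototypes: (N, D) one prototype per candidate relation
    query:      (D,)   embedding of the query instance
    """
    dists = ((prototypes - query) ** 2).sum(dim=1)   # (N,)
    logits = -dists                                  # higher = closer
    return int(logits.argmax()), logits
```

In a K-shot setting, each prototype here would come from aggregating the K support embeddings of one relation, e.g. via the attention-weighted aggregation sketched earlier.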
