Most existing methods for relation extraction tasks depend heavily on large-scale annotated data; they cannot learn from existing knowledge and have low generalization ability. It is urgent for us to solve the above problems by further developing few-shot learning methods. Because of the limitations of the most commonly used CNN model which is not good at sequence labeling and capturing long-range dependencies, we proposed a novel model that integrates the transformer model into a prototypical network for more powerful relation-level feature extraction. The transformer connects tokens directly to adapt to long sequence learning without catastrophic forgetting and is able to gain more enhanced semantic information by learning from several representation subspaces in parallel for each word. We evaluate our method on three tasks, including in-domain, cross-domain and cross-sentence tasks. Our method achieves a trade-off between performance and computation and has an approximately 8% improvement in different settings over the state-of-the-art prototypical network. In addition, our experiments also show that our approach is competitive when considering cross-domain transfer and cross-sentence relation extraction in few-shot learning methods.
Read full abstract