Abstract

Relation extraction is a critical step in knowledge recommendation for big data, but the long-tailed distribution of real-world relations presents a significant challenge. Most relations fail to gather enough training instances, forming a long tail in the data distribution and leading to poor performance on these relations. Previous studies have sought to improve models for long-tail relations by sharing knowledge from head classes with the tail. Despite proven effectiveness on long-tail relations, this line of work lacks control over the knowledge transfer process, which can harm performance on head classes. To address this issue, we propose an approach that enhances the label hierarchical dependencies of a classifier through label-to-sentence attention with multi-granular constraints across different levels of the relation hierarchy. Moreover, we introduce an ensemble mechanism that uses a router module to balance performance between head and tail classes, thereby mitigating the long-tail problem in relation extraction. Our approach achieves strong performance on both long-tail relations and all relations on the large-scale New York Times benchmark dataset, without sacrificing performance on head relations. The experimental results demonstrate that our approach effectively alleviates the long-tail problem and boosts performance on long-tail classes in relation extraction without harming performance on head classes.
