Abstract

Entity linking, the process of connecting textual mentions in documents to canonical entities in a knowledge base, plays an integral role in many natural language processing tasks. A significant challenge in the field is the scarcity of labeled resources, particularly in specialized domains, which makes few-shot entity linking important in real-world scenarios. Previous works address the lack of in-domain labeled data by generating synthetic data. However, we argue that synthetic data is frequently of low quality, and that such low-quality instances introduce noise and diminish the ability of entity linking models to capture the semantic consistency between mentions and entities. In this paper, we propose H2FEL, a framework that introduces high-correlation, high-quality instances for few-shot entity linking. We observe that general domains offer abundant high-quality labeled data, some of which is highly correlated with the target domain. We therefore first design an adversarial instance extraction module that extracts such high-correlation instances without depending on additional manually annotated data. To further mitigate the negative effects of low-correlation instances, we train our entity linking model with a variant of curriculum learning. Experimental results on the few-shot entity linking dataset demonstrate the effectiveness of the proposed H2FEL framework, which achieves state-of-the-art performance.
