Abstract

Open Information Extraction (OpenIE) aims to construct expansive open knowledge bases (OKBs) by extracting triples of the form (noun phrase, relation phrase, noun phrase) from unstructured text. A critical problem in OKBs is that noun phrases and relation phrases are not canonicalized, so redundant and ambiguous facts are stored. Consequently, open knowledge base canonicalization, which clusters synonymous phrases into the same group, has emerged as an active research area. Existing approaches either use fact triples and source context in isolation or, at best, combine them only at the clustering level. As a result, the two types of knowledge interact loosely or not at all, and valuable intermediate information may be lost. In this paper, we propose MuFIC, a novel unsupervised framework that addresses these limitations by modeling the interaction between fact triples and source context at the feature level. To capture and integrate fine-grained fact and context knowledge, we design three levels of feature interaction: low-level context-guided feature interaction, mid-level fact-guided feature interaction, and high-level gated fusion feature interaction. Furthermore, we introduce an additional contrastive learning objective to improve the quality of the extracted features and reduce knowledge-specific noise. Finally, we design a bidirectional feedback mechanism: prototype learning over side information guides the learning of the joint features, and the clustering results formed by the joint features are in turn used to dynamically optimize the side information. Extensive experiments on three public benchmark datasets demonstrate the superiority of our proposed framework.
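
To make the idea of gated fusion at the feature level more concrete, the following is a minimal sketch of a generic sigmoid-gated blend of two feature vectors for the same phrase. It is not MuFIC's actual formulation, which the abstract does not specify: the function name `gated_fusion`, the parameters `weights` and `bias`, the vector dimension, and the random inputs are all assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(fact_vec, ctx_vec, weights, bias):
    """Blend a fact-view and a context-view vector with an element-wise gate.

    Hypothetical sketch: the gate for dimension j is a sigmoid over a linear
    function of the concatenated views, and the fused value is a convex
    combination of the two views in that dimension.
    """
    concat = fact_vec + ctx_vec  # list concatenation, length 2d
    fused = []
    for j, (f, c) in enumerate(zip(fact_vec, ctx_vec)):
        z = sum(w * x for w, x in zip(weights[j], concat)) + bias[j]
        g = sigmoid(z)  # how much to trust the fact view in dimension j
        fused.append(g * f + (1.0 - g) * c)
    return fused


# Toy inputs (random placeholders, not real embeddings).
random.seed(0)
d = 4
fact = [random.gauss(0.0, 1.0) for _ in range(d)]
ctx = [random.gauss(0.0, 1.0) for _ in range(d)]
weights = [[random.gauss(0.0, 0.1) for _ in range(2 * d)] for _ in range(d)]
bias = [0.0] * d

joint = gated_fusion(fact, ctx, weights, bias)
print(joint)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding fact and context values. In a trained model the gate parameters would be learned rather than sampled at random.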
