Abstract
Inaccurate knowledge remains a problem in the Semantic Web. Knowledge inaccuracy in Knowledge Bases (KBs) is one of the largest obstacles limiting the development of Linked Open Data (LOD) and Knowledge Graphs (KGs). To address the semantic ambiguity and improper classification of knowledge triples that arise when constructing Chinese online encyclopedia KBs, this paper proposes a new iterative filtering method called IFTA. First, a new algorithm, TF-AICL, calculates the concentration level of predicates in each top-category. Second, the predicate that best represents the features of a top-category is selected, and the related predicate candidate set is extracted. Third, using a strategy that counts positive and negative examples, the predicate candidate set serves as the comparison group to filter each entity. IFTA thus builds on TF-AICL, a new predicate feature extraction method that considers the hierarchical features of predicates, and it automatically prunes, filters and refines large-scale online encyclopedia knowledge in an iterative way. Precision, recall and F-measure results on the BaiduBaike and Hudong datasets indicate that IFTA refines open-domain Chinese encyclopedia KBs more effectively than state-of-the-art methods.
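The pipeline described above (score predicates per top-category, extract a candidate set, filter entities by positive and negative counts, repeat) can be illustrated with a minimal sketch. The abstract does not give the TF-AICL formula, so the score below is a plain TF-IDF-style stand-in and not the authors' algorithm. The function names, the `top_k` parameter, the keep rule (positives must not be outnumbered by negatives) and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def concentration(entity_preds, entity_cat):
    """Stand-in for TF-AICL (assumption): predicate frequency within a
    category, weighted by how few categories use that predicate."""
    cat_counts = defaultdict(Counter)
    for ent, preds in entity_preds.items():
        cat_counts[entity_cat[ent]].update(set(preds))
    n_cats = len(cat_counts)
    cats_with = Counter()
    for counts in cat_counts.values():
        cats_with.update(counts.keys())
    scores = {}
    for cat, counts in cat_counts.items():
        total = sum(counts.values())
        scores[cat] = {
            p: (c / total) * math.log((1 + n_cats) / cats_with[p])
            for p, c in counts.items()
        }
    return scores


def iterative_filter(entity_preds, entity_cat, top_k=2, max_rounds=10):
    """Repeatedly drop entities whose predicates mostly fall outside
    their category's candidate set (hypothetical keep rule)."""
    kept = dict(entity_cat)
    for _ in range(max_rounds):
        scores = concentration({e: entity_preds[e] for e in kept}, kept)
        # Candidate set: the top_k highest-scoring predicates per category,
        # with ties broken alphabetically so the result is deterministic.
        candidates = {
            cat: set(sorted(s, key=lambda p: (-s[p], p))[:top_k])
            for cat, s in scores.items()
        }
        removed = []
        for ent, cat in kept.items():
            preds = set(entity_preds[ent])
            positives = len(preds & candidates[cat])
            negatives = len(preds - candidates[cat])
            if positives < negatives:
                removed.append(ent)
        if not removed:
            break  # nothing changed, so the filtering has converged
        for ent in removed:
            del kept[ent]
    return kept


# Toy data (invented): "Yangtze" is deliberately misfiled under "Person".
entity_preds = {
    "Li Bai": ["birth date", "occupation", "nationality"],
    "Du Fu": ["birth date", "occupation"],
    "Yangtze": ["length", "source"],
    "Yellow River": ["length", "source"],
    "Nile": ["length", "source"],
}
entity_cat = {
    "Li Bai": "Person",
    "Du Fu": "Person",
    "Yangtze": "Person",
    "Yellow River": "River",
    "Nile": "River",
}
refined = iterative_filter(entity_preds, entity_cat)
```

In this toy run the misfiled entity shares no predicates with the "Person" candidate set and is removed in the first round, and the second round finds nothing further to remove. Recomputing the scores after each round is the design choice that makes the filter iterative: removing noisy entities changes the predicate statistics on which the next round's candidate sets are built.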