Abstract

Inaccurate knowledge remains a problem in the Semantic Web. Knowledge inaccuracy in Knowledge Bases (KBs) is one of the largest obstacles limiting the development of Linked Open Data (LOD) and Knowledge Graphs (KGs). To address the semantic ambiguity and improper classification of knowledge triples that arise when constructing Chinese online encyclopedia KBs, this paper proposes a new iterative filtering method called IFTA. First, a new algorithm, TF-AICL, calculates the concentration level of predicates in each top-category; it is a predicate feature extraction method that considers the hierarchical features of the predicate. Second, the predicate that best represents the features of a top-category is selected, and the related predicate candidate set is extracted. Third, based on a strategy of counting positive and negative examples, the predicate candidate set is used as the comparison group to filter each entity. Applied iteratively, these steps let IFTA automatically prune, filter and refine large-scale online encyclopedia knowledge. Precision, recall and F-measure results on the BaiduBaike and Hudong datasets indicate that IFTA refines open-domain Chinese encyclopedia KBs more effectively than state-of-the-art methods.
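The three steps and the iteration described above can be illustrated with a minimal sketch. The abstract does not give the TF-AICL formula, so the score below is a plain TF-IDF-style stand-in, not TF-AICL itself. The function names, the candidate-set size `k`, the keep rule (at least one positive match and no more negative than positive matches) and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def concentration(entities, categories):
    """Score how concentrated each predicate is in each top-category.

    Assumption: a TF-IDF-style stand-in, NOT the paper's TF-AICL formula.
    entities: {entity: set of predicates}; categories: {entity: top-category}.
    """
    per_cat = defaultdict(Counter)
    for entity, preds in entities.items():
        per_cat[categories[entity]].update(preds)
    n_cats = len(per_cat)
    # Number of top-categories in which each predicate occurs at all.
    cats_with = Counter(p for counts in per_cat.values() for p in counts)
    scores = {}
    for cat, counts in per_cat.items():
        total = sum(counts.values())
        scores[cat] = {
            p: (c / total) * math.log((1 + n_cats) / cats_with[p])
            for p, c in counts.items()
        }
    return scores


def candidate_sets(scores, k=2):
    """Keep the k highest-scoring predicates per category (ties broken by name)."""
    return {
        cat: set(sorted(s, key=lambda p: (-s[p], p))[:k])
        for cat, s in scores.items()
    }


def iterative_filter(entities, categories, k=2, max_rounds=10):
    """Repeatedly drop entities whose predicates do not fit their category."""
    kept = dict(entities)
    for _ in range(max_rounds):
        cands = candidate_sets(concentration(kept, categories), k)
        survivors = {}
        for entity, preds in kept.items():
            own = cands[categories[entity]]
            others = set().union(
                *(c for cat, c in cands.items() if cat != categories[entity])
            ) - own
            positive = len(preds & own)     # matches with own category
            negative = len(preds & others)  # matches with other categories
            if positive > 0 and positive >= negative:  # hypothetical keep rule
                survivors[entity] = preds
        if len(survivors) == len(kept):  # nothing removed: converged
            break
        kept = survivors
    return kept


# Toy data (invented): "Mount Tai" is deliberately misfiled under "person".
entities = {
    "Li Bai": {"birthplace", "occupation", "works"},
    "Du Fu": {"birthplace", "occupation", "works"},
    "Beijing": {"area", "population", "mayor"},
    "Shanghai": {"area", "population", "mayor"},
    "Mount Tai": {"area", "population"},
}
categories = {
    "Li Bai": "person", "Du Fu": "person",
    "Beijing": "place", "Shanghai": "place",
    "Mount Tai": "person",
}
refined = iterative_filter(entities, categories)
```

On this toy input the misfiled entity shares no predicate with the "person" candidate set and is removed in the first round; the second round removes nothing, so the loop stops. Recomputing the scores on each round is what makes the filter iterative: once noisy entities are gone, the candidate sets are rebuilt from cleaner data.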
