Detect Incorrect Triples in Knowledge Base Based on Triple Confidence Evaluation

Haihua Xie, Xiaojun Huang, Zhi Tang, Xiaoqing Lu

doi:10.1145/3133811.3133829

Abstract

The knowledge base is an important form of data storage and organization in the field of knowledge services, and it is the basis of knowledge representation learning. The accuracy of a knowledge base's contents determines the effectiveness of knowledge service applications. This study proposes a generic computational methodology to evaluate the confidence level of triples in knowledge bases and to detect potentially incorrect ones for further verification. In our methodology, the confidence of a triple is evaluated based on weighted feature words that characterize the subject-object relation embedded in the triple; the feature words are extracted from a corpus of natural language sentences using statistical and natural language processing techniques. Based on the calculated confidence values, incorrect triples are detected using machine-learning-based classification. An experiment on a data set from industry applications demonstrates the workflow of evaluating triple confidence and detecting incorrect triples using our methodology.
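
As a rough illustration of the workflow summarized above, the following minimal Python sketch scores a triple by the weighted feature words found in corpus sentences that mention both its subject and its object, then flags low-scoring triples. It is a sketch under stated assumptions, not the scoring formula or classifier used in the study: the function names, the example weights, the normalization by total weight, and the fixed threshold (a stand-in for the machine-learning-based classification) are all hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def triple_confidence(
    triple: Triple,
    corpus: List[str],
    feature_weights: Dict[str, Dict[str, float]],
) -> float:
    """Return a score in [0, 1] for a triple (hypothetical scoring rule).

    For each sentence mentioning both subject and object, sum the weights
    of the relation's feature words that occur in it, normalize by the
    total weight, and keep the best sentence score.
    """
    subject, relation, obj = triple
    weights = feature_weights.get(relation, {})
    total = sum(weights.values())
    if total == 0:
        return 0.0
    best = 0.0
    for sentence in corpus:
        text = sentence.lower()
        if subject.lower() not in text or obj.lower() not in text:
            continue  # sentence gives no evidence about this pair
        tokens = set(text.replace(",", " ").replace(".", " ").split())
        matched = sum(w for word, w in weights.items() if word in tokens)
        best = max(best, matched / total)
    return best


def flag_suspect_triples(
    triples: List[Triple],
    corpus: List[str],
    feature_weights: Dict[str, Dict[str, float]],
    threshold: float = 0.5,  # assumption: fixed cutoff instead of a trained classifier
) -> List[Triple]:
    """Return triples whose confidence falls below the threshold."""
    return [
        t for t in triples
        if triple_confidence(t, corpus, feature_weights) < threshold
    ]


# Toy data, invented purely for illustration.
corpus = [
    "Aspirin is used to treat headache and relieve fever.",
    "Aspirin was first synthesized by Bayer.",
]
feature_weights = {"treats": {"treat": 0.6, "relieve": 0.3, "cure": 0.1}}
triples = [("Aspirin", "treats", "headache"), ("Aspirin", "treats", "Bayer")]

suspects = flag_suspect_triples(triples, corpus, feature_weights)
```

In this toy example the second triple is flagged because the only sentence that mentions both "Aspirin" and "Bayer" contains none of the feature words assumed for the relation. In practice the threshold rule would be replaced by a classifier trained on labeled triples.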

Full Text