Abstract

Knowledge Graphs (KGs) contain a vast amount of structured information in the form of entities and relations. Because KGs are often constructed automatically through information extraction, they may miss information that was either absent from the original source or not successfully extracted. As a result, KGs may lack valuable information. Current approaches to completing KGs have two main drawbacks. First, some depend on embedded representations, which impose a very expensive preprocessing step and must be recomputed as the KG grows. Second, others rely on long random paths that may not cover all relevant information, whereas exhaustively analyzing all possible paths between entities is very time-consuming. In this paper, we present an approach to completing KGs that evaluates candidate triples using a set of neighborhood-based features. Our approach exploits the highly connected nature of KGs by analyzing the entities and relations surrounding any given pair of entities, while avoiding full recomputation as new entities are added. Our results indicate that our proposal identifies correct triples more effectively than other state-of-the-art approaches, achieving higher average F1 scores on all tested datasets. We therefore conclude that the information in the vicinity of the two entities in a candidate triple can be leveraged to determine whether that triple is missing from the KG.
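
To make the idea of neighborhood-based features concrete, the following minimal Python sketch computes a few local features for a candidate triple. It is an illustration only: the abstract does not list the actual features, so the function names (`build_index`, `neighborhood_features`), the four features, and the toy triples are all assumptions, not the method evaluated in the paper.

```python
from collections import defaultdict


def build_index(triples):
    """Index a KG given as (head, relation, tail) triples."""
    neighbors = defaultdict(set)  # entity -> adjacent entities, either direction
    out_rels = defaultdict(set)   # entity -> relations in which it is the head
    in_rels = defaultdict(set)    # entity -> relations in which it is the tail
    for head, relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
        out_rels[head].add(relation)
        in_rels[tail].add(relation)
    return neighbors, out_rels, in_rels


def neighborhood_features(index, head, relation, tail):
    """Hypothetical features computed only from the vicinity of head and tail."""
    neighbors, out_rels, in_rels = index
    head_nbrs = neighbors.get(head, set())
    tail_nbrs = neighbors.get(tail, set())
    shared = head_nbrs & tail_nbrs
    union = head_nbrs | tail_nbrs
    return {
        # Entities directly connected to both head and tail.
        "shared_neighbors": len(shared),
        # Overlap of the two neighborhoods, normalized by their combined size.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Whether the head already appears as a subject of this relation.
        "head_has_relation": relation in out_rels.get(head, set()),
        # Whether the tail already appears as an object of this relation.
        "tail_has_relation": relation in in_rels.get(tail, set()),
    }


# Toy KG, invented for illustration.
triples = [
    ("Madrid", "capital_of", "Spain"),
    ("Madrid", "located_in", "Europe"),
    ("Spain", "located_in", "Europe"),
    ("Paris", "capital_of", "France"),
    ("Paris", "located_in", "Europe"),
    ("France", "located_in", "Europe"),
]
index = build_index(triples)

# Candidate triple that is not in the toy KG.
feats = neighborhood_features(index, "Paris", "located_in", "France")
print(feats)
```

In this sketch, adding a new triple only touches the index entries of its two entities, which is one simple way to avoid a full recomputation as the graph grows. A classifier trained on such feature vectors could then score candidate triples, although the abstract does not say which model the authors use.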
