Abstract

The semantic web has enabled the creation of a growing number of knowledge bases (KBs), which are designed independently using different techniques. Integration of KBs has attracted much attention as different KBs usually contain overlapping and complementary information. Automatic techniques for KB integration have been improved but far from perfect. Therefore, in this paper, we study the problem of knowledge base semantic integration using crowd intelligence. There are both classes and instances in a KB, in our work, we propose a novel hybrid framework for KB semantic integration considering the semantic heterogeneity of KB class structures. We first perform semantic integration of the class structures via crowdsourcing, then apply the blocking-based instance matching approach according to the integrated class structure. For class structure (taxonomy) semantic integration, the crowd is leveraged to help identifying the semantic relationships between classes to handle the semantic heterogeneity problem. Under the conditions of both large scale KBs and limited monetary budget for crowdsourcing, we formalize the class structure (taxonomy) semantic integration problem as a Local Tree Based Query Selection (LTQS) problem. We show that the LTQS problem is NP-hard and propose two greedy-based algorithms, i.e., static query selection and adaptive query selection. Furthermore, the KBs are usually of large scales and have millions of instances, direct pairwise-based instance matching is inefficient. Therefore, we adopt the blocking-based strategy for instance matching, taking advantage of the class structure (taxonomy) integration result. The experiments on real large scale KBs verify the effectiveness and efficiency of the proposed approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call