Abstract

The usability of data in millions of relational tables on the web can be improved by transforming them into a knowledge graph (KG). Unlike relational tables, which have a fixed number of columns and no explicit links between the entities they contain, KGs impose no column restrictions, and each entity and relation is uniquely identified by a Uniform Resource Identifier (URI), which enables interlinking between entities. A complete transformation requires mapping (table) cells to (KG) entities, (table) columns to (KG) properties, i.e., binary relations, and the (whole) table to a (KG) class. The last mapping allows the subject entity of each row (e.g., countries in a country table) to be explicitly assigned an appropriate category. Unfortunately, most existing transformation methods focus only on mapping table cells to KG entities; the remaining two tasks are rarely addressed. In this work, we propose a table-to-KG transformation pipeline that accomplishes all three tasks. Our approach differs from T2K, the only existing transformation approach that performs all three tasks, in that T2K employs supervised learning models, while our pipeline consists of a number of heuristics that do not depend on ground truths in the T2D gold standard dataset. In cell-to-entity mapping, the proposed method outperforms T2K and achieves performance comparable to TabEAno, an unsupervised approach specialized for this task. In the other two tasks, although we did not outperform T2K, the unsupervised nature of our approach means that it does not critically depend on gold standard data.
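To make the three mapping tasks concrete, the following minimal Python sketch runs them on a toy in-memory KG. It is an illustration only and not the heuristics of the proposed pipeline: the `ex:` URIs, the facts, the function names (`link_cell`, `link_column`, `link_table`), and the matching rules (exact label match and majority voting) are all assumptions made for this example.

```python
from collections import Counter

# Toy KG with hypothetical URIs and facts (assumed for illustration only).
LABELS = {
    "germany": "ex:Germany",
    "france": "ex:France",
    "berlin": "ex:Berlin",
    "paris": "ex:Paris",
}
TYPES = {
    "ex:Germany": "ex:Country",
    "ex:France": "ex:Country",
    "ex:Berlin": "ex:City",
    "ex:Paris": "ex:City",
}
TRIPLES = {
    ("ex:Germany", "ex:capital", "ex:Berlin"),
    ("ex:France", "ex:capital", "ex:Paris"),
}


def link_cell(cell):
    """Cell-to-entity: exact, case-insensitive label match (assumed rule)."""
    return LABELS.get(cell.strip().lower())


def link_column(rows, subject_col, object_col):
    """Column-to-property: each row votes for the properties that connect
    its linked subject entity to its linked object entity in the KG."""
    votes = Counter()
    for row in rows:
        subj = link_cell(row[subject_col])
        obj = link_cell(row[object_col])
        for s, prop, o in TRIPLES:
            if s == subj and o == obj:
                votes[prop] += 1
    return votes.most_common(1)[0][0] if votes else None


def link_table(rows, subject_col):
    """Table-to-class: majority class among the linked subject entities."""
    entities = (link_cell(row[subject_col]) for row in rows)
    votes = Counter(TYPES[e] for e in entities if e in TYPES)
    return votes.most_common(1)[0][0] if votes else None


# A small country table; the last row has no match in the toy KG.
TABLE = [
    ["Germany", "Berlin"],
    ["France", "Paris"],
    ["Atlantis", "Nowhere"],
]

if __name__ == "__main__":
    print([link_cell(row[0]) for row in TABLE])
    print(link_column(TABLE, 0, 1))
    print(link_table(TABLE, 0))
```

None of these rules needs labeled training data, which is the sense in which a heuristic pipeline can avoid relying on a gold standard; a realistic system would also need fuzzy label matching and disambiguation between candidate entities.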

