Abstract

Semantic table annotation, also known as semantic table interpretation (STI), is the process of annotating tabular data with the concepts and relationships of a given knowledge graph. The task has attracted considerable work in recent years, with excellent results. However, real-world tables exhibit an ever-growing variety of naming conventions and type relationships, and many existing annotation systems based on unsupervised patterns cannot efficiently extract the deep features needed to match table cells. To extract such complex features from real tabular data, we propose a deep learning-based semantic table annotation system that annotates table cells using a fine-tuned deep learning model. We also propose a new iterative matching framework that improves both the annotation capability and the efficiency of the system. For efficiency, the framework expands the neighbor information retrieved from the knowledge graph, so that a single query can match multiple target cells. For accuracy, the framework uses the results of column-pair annotation to re-annotate target cells, checking and correcting cell annotation errors. Our system achieves above-average results on two evaluation datasets from SemTab2021, and its feature extraction on real data outperforms expert feature engineering. It also achieves good results on a real dataset from DBpedia.
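
The abstract gives no implementation details, so the following is only a toy sketch of how the two ideas in the iterative framework might fit together: one neighbor lookup per subject entity serves several cells in a row, and the column-pair annotation is then used to correct cell annotations. Everything in it is an assumption made for illustration: the in-memory `TOY_KG`, its identifiers, the exact-label matching (where the described system uses a fine-tuned model), and the function names `annotate_column_pair` and `reannotate_cells`.

```python
from collections import Counter

# Hypothetical toy knowledge graph: entity -> {property: (neighbor id, neighbor label)}.
# A real system would query a knowledge graph such as DBpedia instead.
TOY_KG = {
    "Q_paris_fr": {"country": ("Q_france", "France")},
    "Q_paris_tx": {"country": ("Q_usa", "United States")},
    "Q_berlin": {"country": ("Q_germany", "Germany")},
    "Q_lyon": {"country": ("Q_france", "France")},
}


def annotate_column_pair(rows, candidates):
    """Vote for the property that links the subject column to the object column."""
    votes = Counter()
    for subject_text, object_text in rows:
        for entity in candidates[subject_text]:
            # One neighbor lookup per candidate covers every other cell in the row.
            for prop, (_, label) in TOY_KG[entity].items():
                if label.lower() == object_text.lower():
                    votes[prop] += 1
    return votes.most_common(1)[0][0] if votes else None


def reannotate_cells(rows, candidates, prop):
    """Re-annotate each row, preferring candidates consistent with the column pair."""
    annotations = []
    for subject_text, object_text in rows:
        ranked = candidates[subject_text]
        subject_entity, object_entity = ranked[0], None  # default: top-ranked candidate
        for entity in ranked:
            neighbor = TOY_KG[entity].get(prop)
            if neighbor and neighbor[1].lower() == object_text.lower():
                # The same lookup annotates both the subject and the object cell.
                subject_entity, object_entity = entity, neighbor[0]
                break
        annotations.append((subject_entity, object_entity))
    return annotations


rows = [("Paris", "France"), ("Berlin", "Germany"), ("Lyon", "France")]
# Candidate lists as an initial cell matcher might rank them; "Paris" is ranked wrongly.
candidates = {
    "Paris": ["Q_paris_tx", "Q_paris_fr"],
    "Berlin": ["Q_berlin"],
    "Lyon": ["Q_lyon"],
}

prop = annotate_column_pair(rows, candidates)
result = reannotate_cells(rows, candidates, prop)
print(prop, result)
```

In this toy run the initial ranking puts the wrong "Paris" first; the column-pair vote settles on `country`, and the re-annotation pass replaces the wrong candidate with the one whose `country` neighbor matches the object cell.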
