Abstract

Web tables are valuable resources for enriching existing knowledge bases. Web table interpretation maps columns and column pairs in web tables to the entities and relations of a knowledge base; these are known as the Column Type Annotation (CT) and Relation Extraction (RE) tasks, respectively. Learning useful and accurate column vector representations plays a central role in solving both tasks. A column’s semantics can be determined by three kinds of relationships: Column-Cell, Column-Table, and Column-Column. Existing works rely only on the Column-Cell relationship to determine column semantics, which results in suboptimal performance. In this paper, we propose to solve the CT and RE tasks with heterogeneous graph neural networks. First, we construct heterogeneous graphs for web tables. Next, we propose a new model named Tab-HGNN that considers all three kinds of relationships to learn target column representations. Extensive experiments on real-world web table datasets demonstrate the effectiveness of Tab-HGNN on the CT and RE tasks. It outperforms competitive baselines by up to +7.9% and +1.6% in macro F1 score and +2.6% and +0.2% in weighted F1 score on two CT datasets, respectively. It also achieves gains of +0.3% in macro F1 score and +0.1% in weighted F1 score on one RE dataset.
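
As a rough illustration of the three relationships named above, the following minimal sketch builds a small typed graph for a single table. It is not the Tab-HGNN construction itself: the function name `build_table_graph`, the node and edge type labels, the choice to share cell nodes by value, and the sample table are all assumptions made for this example.

```python
from collections import defaultdict


def build_table_graph(table_id, header, rows):
    """Build a small typed graph for one table (illustrative only).

    Nodes are (type, key) tuples. Edges are grouped by relation name,
    mirroring the Column-Cell, Column-Table, and Column-Column
    relationships. All naming here is hypothetical.
    """
    table = ("table", table_id)
    edges = defaultdict(set)
    columns = [("column", f"{table_id}:{i}") for i in range(len(header))]

    for col_idx, col in enumerate(columns):
        # Column-Table: every column belongs to its table.
        edges["column-table"].add((col, table))
        # Column-Cell: link the column to each distinct cell value it holds.
        # Assumption: cells are keyed by value, so equal values share a node.
        for row in rows:
            edges["column-cell"].add((col, ("cell", row[col_idx])))

    # Column-Column: connect every pair of columns in the same table.
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            edges["column-column"].add((columns[i], columns[j]))

    return edges


# Hypothetical two-column table used only to exercise the sketch.
edges = build_table_graph(
    "t1",
    ["Name", "City"],
    [["Alice", "Paris"], ["Bob", "Berlin"]],
)
```

A heterogeneous graph neural network would then pass messages along each edge type separately, so that a column representation can draw on its cells, its table, and its sibling columns rather than on its cells alone.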
