Abstract

In the era of information explosion, multi-sourced data may contain conflicts or even errors. To address this issue, many truth discovery methods have been proposed to extract trustworthy information from conflicting data. However, most existing truth discovery methods are designed for structured data and cannot meet the strong need to extract trustworthy information from raw text data. For text data, answers are rarely completely correct or completely wrong; most are partially correct, which differs markedly from the traditional truth discovery setting. In addition, traditional methods estimate the reliability of a source from a large number of observations that the source provides, but in text truth discovery it is difficult to obtain enough observations for each source. Moreover, traditional methods ignore the structural and semantic information of text data, which leads to suboptimal results. To address these challenges, we propose a Graph Convolutional Network (GCN) based truth discovery model to discover trustworthy information from text data. First, Smooth Inverse Frequency (SIF) is used to learn real-valued vector representations of the text data. Then, we construct an undirected graph from these vectors to capture the structural information of the answers. After that, the GCN is used to store and update the reliability of these answers by summing the feature vectors of all neighboring answers, which improves the accuracy and efficiency of truth discovery. Unlike traditional methods, we store the reliability of answers as vectors, which have higher representation capability than real numbers, and we adopt a network, rather than simplified functions, to capture the complex relationships between answers. Experimental results on real datasets show that, although text data structures are complex, our model can still find reliable answers when compared with retrieval-based and state-of-the-art approaches.
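The three stages named in the abstract (SIF embedding, undirected graph construction, GCN aggregation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how edges are chosen or how the GCN is configured, so the cosine-similarity threshold, the symmetric-normalized propagation rule (a common GCN formulation, used here in place of a plain neighbor sum), the random toy word vectors, and all function names (`sif_embeddings`, `build_graph`, `gcn_layer`) are assumptions made purely for illustration.

```python
import numpy as np
from collections import Counter


def sif_embeddings(answers, word_vecs, a=1e-3):
    """SIF: frequency-weighted average of word vectors, then removal
    of the projection onto the first principal direction."""
    tokens = [ans.lower().split() for ans in answers]
    counts = Counter(w for t in tokens for w in t)
    total = sum(counts.values())
    # Weight a / (a + p(w)) down-weights frequent words.
    emb = np.stack([
        np.mean([a / (a + counts[w] / total) * word_vecs[w] for w in t], axis=0)
        for t in tokens
    ])
    # First right singular vector = dominant common direction.
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    pc = vt[0]
    return emb - np.outer(emb @ pc, pc)


def build_graph(emb, threshold=0.0):
    """Undirected graph: connect two answers when their cosine similarity
    exceeds `threshold` (the threshold rule is an assumption)."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.maximum(norms, 1e-12)
    adj = (unit @ unit.T > threshold).astype(float)
    adj = np.maximum(adj, adj.T)   # enforce symmetry (undirected edges)
    np.fill_diagonal(adj, 0.0)     # no self-edges in the raw graph
    return adj


def gcn_layer(adj, feats, weight):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


# Toy data: hypothetical answers and random stand-in word vectors.
rng = np.random.default_rng(0)
answers = [
    "wash hands with soap",
    "wash hands with warm water and soap",
    "drink plenty of water",
    "rest and drink water",
]
vocab = sorted({w for ans in answers for w in ans.lower().split()})
word_vecs = {w: rng.normal(size=8) for w in vocab}

emb = sif_embeddings(answers, word_vecs)          # (4, 8) answer vectors
adj = build_graph(emb)                            # (4, 4) adjacency matrix
hidden = gcn_layer(adj, emb, rng.normal(size=(8, 4)))  # (4, 4) per-answer vectors
```

In this sketch each row of `hidden` is a vector-valued representation of one answer after a single round of neighbor aggregation, which mirrors the abstract's idea of storing answer reliability as a vector rather than a single real number. In practice the weight matrix would be learned and pretrained word vectors would replace the random ones.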
