Abstract

Traditional Relation Extraction (RE) trains individual extractors for pre-defined relations. Open Relation Extraction (ORE) can eschew domain-specific training data, tackle an unbounded number of relations, and scale up to massive and heterogeneous corpora such as the Web. However, microblog texts are difficult to process: the genre is noisy, utterances are very short, and texts have little context. As such, conventional RE approaches fail when faced with microblog and other Web texts. In this paper, we present GCORE, a gravitation-based Chinese ORE approach that extracts relations between entities using an idea borrowed from the law of universal gravitation. GCORE produces heuristic rules to extract entity relations and calculates a confidence score for each relational tuple. The score is directly proportional to the product of the frequency of the entity pair and the frequency of the relation word, and inversely proportional to the square of the distance between the relation word and the candidate entity pair. The confidence score indicates the reliability of a relational tuple: the higher the score, the more reliable the candidate tuple. An experimental evaluation over two data sets from ZORE demonstrates the correctness and effectiveness of the proposed approach, and empirical results on Weibo texts show the universality of GCORE.
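
The confidence score described in the abstract can be illustrated with a minimal Python sketch. It only encodes the stated proportionality (score grows with the product of the two frequencies and shrinks with the square of the distance). The function name `gravitation_score`, the constant `g`, the use of a token count as the distance, and the sample tuples are all assumptions made for illustration; the abstract does not specify how GCORE measures distance, normalises frequencies, or sets any constant.

```python
from typing import List, Tuple


def gravitation_score(pair_freq: float, relation_freq: float,
                      distance: float, g: float = 1.0) -> float:
    """Gravitation-style confidence score (illustrative sketch only).

    pair_freq     -- frequency of the candidate entity pair (assumed raw count)
    relation_freq -- frequency of the relation word (assumed raw count)
    distance      -- distance between relation word and entity pair
                     (assumed here to be a positive token count)
    g             -- hypothetical scaling constant, not taken from the paper
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Proportional to the product of frequencies, inverse to distance squared.
    return g * pair_freq * relation_freq / distance ** 2


def rank_tuples(candidates: List[Tuple[str, float, float, float]]
                ) -> List[Tuple[str, float]]:
    """Score each (label, pair_freq, relation_freq, distance) candidate and
    return (label, score) pairs sorted from most to least reliable."""
    scored = [(label, gravitation_score(pf, rf, d))
              for label, pf, rf, d in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up candidates, purely to exercise the functions.
sample_candidates = [
    ("tuple_far", 4, 10, 4),    # frequent, but relation word is far away
    ("tuple_near", 4, 10, 2),   # same frequencies, relation word is closer
    ("tuple_rare", 1, 2, 1),    # adjacent, but rarely observed
]
ranking = rank_tuples(sample_candidates)
```

Under these assumptions, halving the distance quadruples the score, so a nearby relation word can outweigh a moderately more frequent but distant one. A threshold on the score could then be used to filter candidate tuples, though the cut-off value would have to be tuned on real data.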
