Abstract

With the continuous development of natural language processing, relation extraction (RE) has been studied intensively and now performs well on unstructured text in both English and modern Chinese. In this paper, we study relation extraction from a special type of text: Chinese textual descriptions of Han Dynasty Stone Reliefs (HanDSR). Our aim is an efficient relation extractor for relations of special interest that can be built from a small number of samples. The problem is challenging because the texts contain many rare words, mix modern and ancient Chinese within the same sentence, and have no domain corpus. To address these problems, we propose a relation extraction method based on dependency parsing that augments a basic parser with HanDSR information. To exploit the dependency tree representation, we design five dependency semantic path patterns (DSPPs) that extract relation triples of special interest. In addition, we build the HanDSR Treebank, which contains 4190 sentences and 28124 dependency trees and follows the annotation format of the Penn Chinese Treebank 8.0. The treebank addresses the lack of a domain-specific corpus and can be used to extract relations from such texts. Extensive experiments on the HanDSR dataset demonstrate the accuracy and efficiency of our solution: our proposal significantly outperforms the rule-based relation extraction model in both effectiveness and efficiency.
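
To make the idea of matching a path pattern over a dependency tree concrete, the following is a minimal sketch in Python. It is not one of the paper's five DSPPs, which the abstract does not specify. The `Token` type, the `extract_svo` function, the Stanford-style labels `nsubj` and `dobj`, and the hand-written parse of the example sentence are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface word
    head: int    # index of the head token, 0 for the root
    deprel: str  # dependency label (assumed Stanford-style names)


def extract_svo(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (subject, predicate, object) triples for one hypothetical pattern.

    The pattern matches any token that has both an `nsubj` child and a
    `dobj` child, and emits one triple per subject/object pair.
    """
    triples = []
    for pred in tokens:
        subjects = [t.form for t in tokens
                    if t.head == pred.idx and t.deprel == "nsubj"]
        objects = [t.form for t in tokens
                   if t.head == pred.idx and t.deprel == "dobj"]
        for subj in subjects:
            for obj in objects:
                triples.append((subj, pred.form, obj))
    return triples


# Hand-annotated parse (an assumption, not parser output) of a short
# description in the style of a stone relief caption.
sentence = [
    Token(1, "孔子", 2, "nsubj"),
    Token(2, "见", 0, "root"),
    Token(3, "老子", 2, "dobj"),
]

triples = extract_svo(sentence)
print(triples)
```

A pattern-based extractor of this kind needs no training data beyond the parser itself, which is presumably why the approach suits a setting with few labelled samples; its quality depends directly on how well the parser handles the domain text.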
