Abstract

The Princeton WordNet® (PWN) is a widely used lexical knowledge database for semantic information processing, and wordnets are now under construction for many languages worldwide. In this paper, we construct a wordnet for Pre-Qin ancient Chinese (PQAC), called PQAC WordNet (PQAC-WN), to support semantic information processing of PQAC. Most recently constructed wordnets have been built either manually by experts or automatically from resources that yield translation pairs between English and the target language. The former method is time-consuming, and the latter cannot be applied to PQAC because such language resources are lacking. We therefore propose a method based on word definitions in a monolingual dictionary. Specifically, for each word sense, kernel words are first extracted from its definition, and the senses of each kernel word are then determined by graph-based Word Sense Disambiguation. Finally, one optimal sense is chosen from the kernel word senses to guide the mapping between the word sense and a PWN synset. With this method, 66% of PQAC senses are mapped to synsets shared with English, and a further 14% are language-specific senses that are added to PQAC-WN as new synsets. Overall, the automatic mapping achieves a precision of over 85%.
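The three steps described above (kernel word extraction, graph-based disambiguation, selection of one optimal sense) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how kernel words are extracted, how the graph is weighted, or how the optimal sense is chosen. The sketch assumes that kernel words are simply definition tokens with known candidate senses, that edges are weighted by Lesk-style gloss overlap, and that senses are ranked by weighted degree. The inventory `CANDIDATES`, its synset ids and glosses, and all function names are hypothetical, and English toy data stands in for PQAC.

```python
from itertools import combinations

# Hypothetical toy inventory: kernel word -> {candidate synset id: gloss}.
# Ids and glosses are invented for illustration; they are not PWN data.
CANDIDATES = {
    "river": {
        "river.n.01": "large natural stream of water",
    },
    "bank": {
        "bank.n.01": "sloping land beside a stream of water",
        "bank.n.02": "financial institution that accepts deposits",
    },
}

STOPWORDS = {"a", "an", "the", "of", "that", "beside", "by", "is"}


def extract_kernel_words(definition):
    """Assumed heuristic: keep tokens that have candidate senses."""
    tokens = [t for t in definition.lower().split() if t in CANDIDATES]
    # Drop repeats while preserving the order of first appearance.
    return list(dict.fromkeys(tokens))


def overlap(gloss_a, gloss_b):
    """Count content words shared by two glosses (assumed edge weight)."""
    a = set(gloss_a.split()) - STOPWORDS
    b = set(gloss_b.split()) - STOPWORDS
    return len(a & b)


def score_senses(kernel_words):
    """Score each candidate sense by weighted degree in the sense graph."""
    nodes = [(w, s) for w in kernel_words for s in CANDIDATES[w]]
    scores = {s: 0 for _, s in nodes}
    for (w1, s1), (w2, s2) in combinations(nodes, 2):
        if w1 == w2:
            continue  # senses of the same word compete, so no edge
        weight = overlap(CANDIDATES[w1][s1], CANDIDATES[w2][s2])
        scores[s1] += weight
        scores[s2] += weight
    return scores


def map_sense(definition):
    """Return the top-scoring candidate synset id, or None if none found.

    Ties resolve to the sense of the earliest kernel word (an assumption).
    """
    kernel_words = extract_kernel_words(definition)
    if not kernel_words:
        return None
    scores = score_senses(kernel_words)
    return max(scores, key=scores.get)
```

In this toy inventory, `map_sense("the bank of a river")` selects `bank.n.01`, because its gloss shares two content words ("stream" and "water") with the gloss of `river.n.01`, while `bank.n.02` shares none. A definition with no known kernel words returns `None`, which loosely corresponds to a sense that would need separate handling, such as a language-specific synset.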
