Abstract
Wordnet development is an active area of NLP research. Because the manual construction of the English wordnet was very costly in both time and human expertise, automatic approaches have become popular for building wordnets in languages other than English. Automatic methods usually take an existing wordnet of a high-resource language as their backbone. In this article, we present an unsupervised approach to automatic wordnet construction that combines the Expectation–Maximization and personalized PageRank algorithms. The method relies only on commonly available language resources, namely a bilingual dictionary and a monolingual corpus, so it is applicable to many languages, including under-resourced ones. To evaluate the method, we apply it to Persian, which is considered an under-resourced language for NLP tasks. The evaluation results indicate that the proposed method can construct a high-quality, large-scale wordnet for low-resource languages. In our experiments, we achieve a precision higher than 93% with a recall of 50%.
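For readers unfamiliar with personalized PageRank, the following is a minimal, generic sketch of the algorithm by power iteration. It is not the authors' implementation: the abstract does not describe how the graph is built or how the ranking is combined with Expectation–Maximization. The toy graph, the synset-style node names, the seed weights, and the function name `personalized_pagerank` are all assumptions made for illustration.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Generic personalized PageRank by power iteration on a directed graph."""
    nodes = sorted(
        set(graph) | {v for nbrs in graph.values() for v in nbrs} | set(seeds)
    )
    total = sum(seeds.values())
    # Restart distribution: all teleport mass goes to the seed nodes.
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, [])
            if out:
                # Spread this node's damped mass evenly over its out-links.
                share = damping * rank[n] / len(out)
                for v in out:
                    new_rank[v] += share
            else:
                # Dangling node: send its mass back to the restart distribution.
                for v in nodes:
                    new_rank[v] += damping * rank[n] * restart[v]
        rank = new_rank
    return rank


# Hypothetical toy graph: two senses of "bank" in separate components.
toy_graph = {
    "bank.finance": ["money"],
    "money": ["bank.finance"],
    "bank.river": ["river"],
    "river": ["bank.river"],
}
# Assumed context: the word "money" was observed near "bank" in a corpus.
scores = personalized_pagerank(toy_graph, seeds={"money": 1.0})
best_sense = max(["bank.finance", "bank.river"], key=scores.get)
print(best_sense)
```

Because the restart mass is concentrated on the seed node, rank only accumulates on nodes reachable from it, which is why the finance sense outranks the river sense in this toy example.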