Abstract

This article discusses a corpus-based method for automatically identifying synonyms across different varieties of the same language. The method, grounded in the paradigm of distributional semantics, quantifies semantic similarity on the basis of contextual similarity in two comparable corpora. In two case studies, one for Dutch and one for German, we show that it automatically identifies the correct synonym for 31% and 25% of the target words, respectively. A manual error analysis moreover indicates that many additional synonyms lie very close to the target word in the distributional model, while most other distributional neighbours are semantically related to the target word along dimensions other than synonymy. On the basis of these results, we argue that distributional-semantic methods can play a crucial role in the further evolution of corpus-based lexical semantics into a more quantitative discipline.
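
The general idea of measuring semantic similarity through contextual similarity across two corpora can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the context definition, the weighting scheme, or the similarity measure, so raw co-occurrence counts in a two-word window and cosine similarity are used here as common stand-ins. The function names and the toy English corpora are hypothetical and are not taken from the case studies.

```python
from collections import Counter
import math


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def nearest_neighbours(target, source_vectors, candidate_vectors, k=3):
    """Rank words of the second variety by similarity to a target word of the first."""
    target_vec = source_vectors[target]
    scored = [
        (cosine(target_vec, vec), word)
        for word, vec in candidate_vectors.items()
        if word != target
    ]
    # Highest similarity first; ties are broken alphabetically.
    return [word for _, word in sorted(scored, key=lambda p: (-p[0], p[1]))[:k]]


# Toy comparable corpora standing in for two varieties (invented data).
corpus_a = [
    ["the", "lorry", "carries", "heavy", "cargo"],
    ["a", "lorry", "driver", "delivers", "cargo"],
    ["the", "kettle", "boils", "water"],
]
corpus_b = [
    ["the", "truck", "carries", "heavy", "cargo"],
    ["a", "truck", "driver", "delivers", "cargo"],
    ["the", "stove", "boils", "water"],
]

vec_a = context_vectors(corpus_a)
vec_b = context_vectors(corpus_b)

print(nearest_neighbours("lorry", vec_a, vec_b, k=1))  # → ['truck']
```

The sketch relies on the two corpora sharing most of their context vocabulary, so that vectors built in one variety can be compared directly with vectors built in the other. In realistic data the lower-ranked neighbours would often be related words rather than synonyms, which is the pattern the error analysis describes.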
