Abstract

This paper presents a method for self-organizing monolingual semantic maps: visible, continuous representations in which Chinese or Japanese words with similar meanings are placed at the same or neighboring points, so that the distance between words represents their semantic similarity. We use the self-organizing map (SOM) as the self-organizing device. Each word to be self-organized is defined by a set of co-occurring words collected from Chinese or Japanese newspapers according to their grammatical relationships. The words are then coded into vectors that are fed to the SOM; the coding takes into account the semantic correlation between words, which is established by a form of word-similarity computation. The resulting monolingual semantic maps are assessed by numerical evaluations of accuracy, recall, and the F-measure, by intuition, and by comparison with a clustering method and with multivariate statistical analysis. The paper further discusses extending the proposed method to the construction of Japanese–Chinese bilingual semantic maps, with the aim of providing a semantics-based approach to word alignment in Japanese–Chinese parallel corpora. We also show the effectiveness of this extended method through small-scale experiments that compare it with a clustering method, with multivariate statistical analysis, and with a baseline method in which the alignment of Japanese and Chinese words is determined directly by the Euclidean distance between the vectors representing the words.
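To make the mapping step concrete, the following is a minimal sketch of a generic SOM that places word vectors on a small two-dimensional grid. It is not the authors' implementation: the toy words, the 0/1 feature vectors, the grid size, and the decay schedules are all assumptions chosen for illustration, and the paper's similarity-based vector coding is not reproduced here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matching_unit(weights, v):
    """Return the (row, col) of the map node whose weight vector is closest to v."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = euclidean(w, v)
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows=4, cols=4, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Random initial weight vector for every node of the grid.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5   # neighborhood radius shrinks toward 0.5
        for v in rng.sample(vectors, len(vectors)):  # one shuffled pass over the data
            br, bc = best_matching_unit(weights, v)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood around the best-matching unit.
                    g2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-g2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
    return weights


# Hypothetical toy data: each word is coded as a 0/1 vector over six
# made-up co-occurrence features (not taken from the paper).
words = {
    "dog": [1, 1, 1, 0, 0, 0],
    "cat": [1, 1, 0, 0, 0, 0],
    "car": [0, 0, 0, 1, 1, 1],
    "bus": [0, 0, 0, 1, 1, 0],
}
som = train_som(list(words.values()))
positions = {w: best_matching_unit(som, v) for w, v in words.items()}
print(positions)  # grid coordinates of each word on the trained map
```

On a trained map, words with similar vectors are expected to land on the same or neighboring nodes, so the grid distance between two entries of `positions` serves as a rough proxy for their semantic distance.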
