Onomatopoeia Sense Classification Algorithm Based on Dictionary Definitions

Michito Fujita, Kenji Araki, Yuzu Uchida

doi:10.3156/jsoft.35.1_501

Abstract

Many Japanese onomatopoeic words have multiple meanings, which are determined by the surrounding context. Previous studies have proposed automated word sense classification using vector representations of onomatopoeia obtained from a pre-trained BERT model. Although this method achieves a relatively high level of performance, the annotation cost of creating training data is high, and it is difficult to prepare abundant training data for all onomatopoeia. In this paper, we propose a rule-based word sense classification method that uses the usage examples contained in an onomatopoeia dictionary, aiming to automate word sense classification at low annotation cost. Experimental results show that the rules generated from the usage examples contribute to improving the accuracy of word sense classification.
