Lexical Sememe Prediction with RNN and Modern Chinese Dictionary

Mei Bai, Pin Lv, Xu Long

doi:10.1109/fskd.2018.8687260

Abstract

The knowledge base HowNet defines sememes as the minimum semantic units of words or phrases. Linguists have put considerable effort into manually annotating sememes for words. Although automatic methods have been proposed to reduce this labor-intensive and time-consuming manual annotation, the field is not yet mature. To the best of our knowledge, only three models have been proposed for automatic sememe prediction, and some input information and label structure are not fully exploited. We propose Sememe Prediction with Sentence Embedding and Chinese Dictionary (SPSECD), an end-to-end neural network that performs sentence embedding and sememe prediction in a unified model. To the best of our knowledge, SPSECD is the first model that treats the different senses of a polysemous word separately in the sememe prediction task. Before predicting sememes, our model takes the definition sentence of a word from a Chinese Dictionary as additional input and uses a Recurrent Neural Network to learn an embedding of that sentence. Because each meaning of a word has its own definition in the Chinese Dictionary, this auxiliary information lets the model determine which meaning of a polysemous word is intended and then choose better sememes for that specific meaning. Experiments show that our model achieves state-of-the-art performance.
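
The pipeline described in the abstract (encode a dictionary definition with an RNN, then predict sememes for that sense) can be sketched roughly as follows. This is a minimal illustration, not the SPSECD implementation: the plain Elman-style RNN cell, the concatenation of the word embedding with the sentence vector, the per-sememe sigmoid output, all layer sizes, and every name in the code are assumptions made for the example, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these values come from the paper.
EMB_DIM, HID_DIM, NUM_SEMEMES = 8, 16, 5

# Randomly initialised parameters (a trained model would learn these).
W_xh = rng.normal(scale=0.1, size=(EMB_DIM, HID_DIM))
W_hh = rng.normal(scale=0.1, size=(HID_DIM, HID_DIM))
b_h = np.zeros(HID_DIM)
W_out = rng.normal(scale=0.1, size=(EMB_DIM + HID_DIM, NUM_SEMEMES))
b_out = np.zeros(NUM_SEMEMES)


def encode_definition(token_embeddings):
    """Run a plain RNN over the definition tokens and return the last hidden state."""
    h = np.zeros(HID_DIM)
    for x in token_embeddings:
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    return h


def predict_sememes(word_embedding, definition_embeddings):
    """Score every sememe for one sense of a word (assumed multi-label, sigmoid per sememe)."""
    sentence_vec = encode_definition(definition_embeddings)
    # Assumption: word vector and definition vector are simply concatenated.
    features = np.concatenate([word_embedding, sentence_vec])
    logits = features @ W_out + b_out
    return 1.0 / (1.0 + np.exp(-logits))


# One polysemous word with two senses, each with its own (random stand-in) definition.
word = rng.normal(size=EMB_DIM)
definition_sense_1 = rng.normal(size=(6, EMB_DIM))  # 6 tokens
definition_sense_2 = rng.normal(size=(4, EMB_DIM))  # 4 tokens

scores_1 = predict_sememes(word, definition_sense_1)
scores_2 = predict_sememes(word, definition_sense_2)
```

Because the definition sentence is part of the input, the same word yields a different score vector for each sense, which is the property the abstract relies on to handle polysemy.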

Full Text