Abstract

Korean is the native and official language of the Chinese-Korean people, and Weibo is a social media platform with a huge number of users in China. Currently, there are few studies on sentiment analysis of Korean-language Weibo texts posted by Chinese-Korean users. In this paper, we propose a sentiment classification method for Chinese-Korean Weibo based on a pre-trained language model and transfer learning. First, we crawl Chinese-Korean Weibo data from Sina Weibo and label it with sentiment to obtain the Chinese-Korean Weibo sentiment analysis (CKWSA) dataset. Second, to address the small number of training samples in the CKWSA dataset, we fine-tune a classifier based on a pre-trained Korean language model on a Korean Twitter sentiment analysis dataset to obtain a Korean Twitter sentiment classification model, and then further fine-tune that model on the CKWSA dataset to obtain the Chinese-Korean Weibo sentiment classification model. Experiments show that the proposed method, based on a pre-trained language model and transfer learning, performs well and improves on the other baselines on the CKWSA dataset.
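The two-stage fine-tuning order described above (a larger source-domain dataset first, then the small target dataset) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `TinySentimentClassifier` is a hypothetical bag-of-words logistic regression standing in for the pre-trained Korean language model, and the handful of labelled sentences are made-up toy examples, not samples from the Korean Twitter or CKWSA datasets. It only shows that the second stage continues from the weights learned in the first stage rather than starting from scratch.

```python
import math
from collections import defaultdict


class TinySentimentClassifier:
    """Hypothetical toy model: bag-of-words logistic regression.

    It stands in for the pre-trained Korean language model plus classifier;
    it is NOT the model used in the paper.
    """

    def __init__(self):
        self.weights = defaultdict(float)  # one weight per whitespace token
        self.bias = 0.0

    def predict_proba(self, text):
        """Probability that `text` is positive (label 1)."""
        score = self.bias + sum(self.weights[tok] for tok in text.split())
        return 1.0 / (1.0 + math.exp(-score))

    def predict(self, text):
        return 1 if self.predict_proba(text) >= 0.5 else 0

    def fine_tune(self, samples, epochs=50, lr=0.5):
        """Run SGD starting from the current weights (no re-initialisation)."""
        for _ in range(epochs):
            for text, label in samples:
                error = self.predict_proba(text) - label
                for tok in text.split():
                    self.weights[tok] -= lr * error
                self.bias -= lr * error
        return self


# Made-up stand-in for the larger source-domain data (Korean Twitter in the paper).
source_samples = [
    ("정말 좋아요", 1),
    ("최고 행복", 1),
    ("정말 싫어요", 0),
    ("최악 슬픔", 0),
]

# Made-up stand-in for the small target-domain data (CKWSA in the paper).
target_samples = [
    ("오늘 기쁘다", 1),
    ("오늘 속상하다", 0),
]

model = TinySentimentClassifier()

# Stage 1: fine-tune on the source-domain dataset.
model.fine_tune(source_samples)
stage1_prediction = model.predict("좋아요 최고")

# Stage 2: continue fine-tuning the same model on the target-domain dataset.
model.fine_tune(target_samples)

print("after stage 1:", stage1_prediction)
print("after stage 2:", model.predict("기쁘다 행복"), model.predict("속상하다 슬픔"))
```

Because `fine_tune` never resets the weights, the token weights learned from the source data are still available after the second stage, which is the property that transfer learning relies on when target-domain samples are scarce.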
