Abstract
Traditional text emotion analysis methods focus primarily on long texts, such as news reports and full-length documents. Microblogs are short texts characterized by heavy noise, newly coined words, and abbreviations. When applied to short texts or micro-texts, previous emotion classification methods usually fail to extract significant features and classify poorly. To address these problems, this study proposes CNN_Text_Word2vec, a microblog emotion classification model based on a convolutional neural network (CNN). CNN_Text_Word2vec uses a word2vec neural network model to train a distributed embedding for each word. The trained word vectors serve as input features, and the model learns microblog text features through parallel convolution layers with multiple convolution kernels of different sizes. Experimental results show that the overall accuracy of CNN_Text_Word2vec is 7.0% higher than that of current mainstream methods such as SVM, LSTM, and RNN. This study also explores how the choice of semantic unit affects the accuracy of CNN_Text_Word2vec, specifically for Chinese texts. The experimental results show that feature vectors trained on Chinese characters perform better than feature vectors trained on words.
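The feature-extraction step described above (embedded tokens passed through parallel convolutions with kernels of different sizes) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the kernel sizes (2, 3, 4), the four filters per size, the ReLU activation, the max-over-time pooling, and the random vectors standing in for trained word2vec embeddings are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def make_embeddings(vocab, dim, seed=0):
    """Stand-in for trained word2vec vectors (assumption): one random vector per token."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in vocab}


def make_filters(kernel_sizes, num_filters, dim, seed=1):
    """Untrained random kernels; each kernel spans k consecutive tokens of width dim."""
    rng = random.Random(seed)
    return {
        k: [[[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(k)]
            for _ in range(num_filters)]
        for k in kernel_sizes
    }


def conv_max_pool(matrix, kernel):
    """Slide one kernel over the token matrix, apply ReLU, then max-pool over positions."""
    k = len(kernel)
    best = 0.0  # ReLU floor; also the result when the text is shorter than the kernel
    for start in range(len(matrix) - k + 1):
        activation = sum(
            w * x
            for kernel_row, token_vec in zip(kernel, matrix[start:start + k])
            for w, x in zip(kernel_row, token_vec)
        )
        best = max(best, activation)
    return best


def text_features(tokens, embeddings, filters):
    """Concatenate the pooled outputs of every parallel convolution branch."""
    matrix = [embeddings[t] for t in tokens]
    features = []
    for k in sorted(filters):
        features.extend(conv_max_pool(matrix, kernel) for kernel in filters[k])
    return features


# Character-level semantic units: each Chinese character is one token.
tokens = list("今天心情很好")
embeddings = make_embeddings(sorted(set(tokens)), dim=8)
filters = make_filters(kernel_sizes=(2, 3, 4), num_filters=4, dim=8)
features = text_features(tokens, embeddings, filters)
print(len(features))  # 3 kernel sizes x 4 filters = 12 features
```

In a full model, the concatenated feature vector would feed a trained classification layer, and the kernels would be learned rather than random. Switching to word-level units would only change how `tokens` is produced, for example by running a word segmenter instead of `list(...)`.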