Abstract

Existing methods for big data environments do not extract the emotional features of microblog posts sufficiently, and the average accuracy of their analysis results is low. To address this problem, a microblog emotion analysis method using deep learning in a Spark big data environment is proposed. First, the Jieba word segmentation tool is used to process text comments, reducing the interference of irregular grammar and nonstandard words with the microblog emotion analysis task. Then, four kinds of features are selected: affective-rule-based features, unigram features, syntactic features, and dependency-based word collocation features. To avoid the curse of dimensionality caused by an excessive number of features, information gain is used as the feature selection criterion to reduce the feature dimension. Finally, a microblog emotion analysis model based on a deep belief network (DBN) is established, and the DBN is parallelized on a Spark cluster to shorten the training time. Experiments show that when the feature set consists of the top 2000 features, the classification accuracy obtained by fusing the four feature types is 90.94%, which is higher than that of the comparison method. In addition, the training time of the DBN algorithm parallelized on the Spark cluster is only 27.78% of that on a single machine (roughly a 3.6-fold speedup). Compared with the comparison method, the proposed method therefore significantly improves the performance of the microblog emotion analysis system.
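
The information gain step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes documents have already been segmented into token sets, treats each term as a binary present/absent feature, and uses hypothetical helper names (`entropy`, `information_gain`, `top_k_features`) and toy data. The score for a term t is IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)], and the k highest-scoring terms are kept.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy after splitting on the term, weighted by split size.
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def top_k_features(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken alphabetically)."""
    vocab = sorted({t for d in docs for t in d})
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


# Toy, already-segmented documents (hypothetical data, not from the paper).
docs = [{"good", "movie"}, {"bad", "movie"}, {"good", "day"}, {"bad", "day"}]
labels = ["pos", "neg", "pos", "neg"]

selected = top_k_features(docs, labels, k=2)
print(selected)
```

In this toy corpus, "good" and "bad" separate the two classes perfectly while "movie" and "day" carry no class information, so only the former survive selection. The abstract's setting corresponds to k = 2000 applied over the fused feature set.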
