Abstract

Although sentiment analysis of traditional online texts has been studied in depth, sentiment analysis of social media texts remains a challenging research direction. Because social media contains a huge volume of text covering a wide range of topics, it is very difficult to manually collect enough labeled data to train a sentiment classifier for each domain. Distant supervision, which treats emoticons in microblog texts as natural sentiment labels, has been widely used in social media sentiment analysis. However, previous distantly supervised models were typically trained on an isolated set of data and could not handle the scenario in which texts continuously accumulate and topics constantly change. To address these challenges, we propose a distantly supervised lifelong learning framework for large-scale social media sentiment analysis. The key characteristic of our approach is continuous sentiment learning in social media: it learns past tasks sequentially, retains the knowledge obtained from them, and uses that knowledge to help future learning. The lifelong sentiment classifier is trained separately on two large-scale distantly supervised social media datasets and evaluated on nine benchmark datasets. The results show that our lifelong sentiment learning approach is feasible and effective for handling continuously updated texts with dynamic topics in social media. We also show that the belief “the more training data, the better the performance” does not hold in large-scale social media sentiment analysis. On the contrary, by learning continuously from past tasks, our approach outperforms the traditional practice of using all training data in a single task, in terms of both classification performance and computational efficiency.

