Abstract

With the explosion of publicly accessible social data, sentiment analysis has emerged as an important task with applications in e-commerce, politics, and the social sciences. So far, researchers have largely focused on sentiment analysis of texts involving entities such as products, persons, institutions, and events. However, a significant amount of chatter on microblogging websites is not directed at any particular entity. On Twitter, users share their general state of mind, details about how their day went, their plans for the next day, or simply conversational chatter with other users. In this paper, we look into the problem of assessing the sentiment of the publicly available general stream of tweets. Assessing the sentiment of such tweets helps us estimate the overall sentiment being expressed in a geographic location or by a set of users (scoped through some means), which has applications in the social sciences, psychology, and health sciences. The only prior effort [1] that addresses this problem assumes equal proportions of positive, negative, and neutral tweets, but casual observation shows that such a scenario is not realistic. In our work, we therefore first determine the proportions (with appropriate confidence intervals) of positive, negative, and neutral tweets from a set of 1000 randomly curated tweets. Next, adhering to these proportions, we combine an existing dataset [1] with our dataset and conduct experiments that achieve new state-of-the-art results using a large set of features. Our results also demonstrate that methods that work best for tweets containing popular named entities may not work well for general tweets. Finally, we conduct a qualitative error analysis and identify future research directions to further improve performance.
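The proportion estimate mentioned above can be illustrated with a minimal sketch. The abstract does not say which interval method the authors used, so the Wilson score interval here is an assumption, and the label counts are hypothetical placeholders rather than the paper's figures. Each interval is computed per class and is not a simultaneous interval over all three classes.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n.

    z=1.96 corresponds to roughly 95% confidence (an assumed level).
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical label counts for a 1000-tweet sample (not from the paper).
counts = {"positive": 300, "negative": 200, "neutral": 500}
n = sum(counts.values())

intervals = {label: wilson_interval(k, n) for label, k in counts.items()}

for label, (low, high) in intervals.items():
    share = counts[label] / n
    print(f"{label}: {share:.3f} (interval {low:.3f} to {high:.3f})")
```

With a sample of this size, the intervals span a few percentage points on either side of each estimate, which is enough to check whether an equal-proportion assumption is plausible for the observed data.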
