Abstract

The rapid growth of social networks has produced an unprecedented amount of user-generated data, which provides an excellent opportunity for text mining. Sentiment analysis, an important part of text mining, attempts to infer an author's opinion from the content and structure of a text. Such information is particularly valuable for determining the overall opinion of a large number of people. Applications include predicting box office sales and stock prices. One of the most accessible sources of user-generated data is Twitter, which makes the majority of its user data freely available through its data access API. In this study we seek to predict a sentiment value for stock-related tweets on Twitter, and to demonstrate a correlation between this sentiment and the movement of a company's stock price in a real-time streaming environment. Both n-gram and "word2vec" textual representation techniques are used alongside a random forest classification algorithm to predict the sentiment of tweets. The predicted sentiment values are then tested for correlation with the stock price of each company. There are significant correlations between price and sentiment for several individual companies. Some companies, such as Microsoft and Walmart, show strong positive correlation, while others, such as Goldman Sachs and Cisco Systems, show strong negative correlation. This suggests that consumer-facing companies are affected differently than other companies. Overall, this appears to be a promising field for future research.

Highlights

  • Over the last several years there has been an explosion of growth and new activity in social networking

  • There is an incredible amount of useful information about individual opinions, feelings, and relationships contained in these transactions, but the loosely structured nature of human communication makes harnessing this data a challenge

  • There has been a significant amount of research into text analysis, including sentiment analysis, as well as some interest in utilizing these tools for prediction through Twitter. Until now, however, these projects have primarily worked with text analysis and sentiment prediction more generally

Summary

Introduction

Over the last several years there has been an explosion of growth and new activity in social networking. In order to make sense of the large portion of this data which is text-based, Natural Language Processing tools can be used to rigorously categorize user-generated text. One such tool for extracting useful information from massive data sources such as Twitter is sentiment analysis. Text analysis and sentiment detection could provide insight into investor and general public opinion on a company and its stock price on a large scale. This insight could supply additional information for analysis techniques similar to those currently supported by technical analysis. This could be a promising method for determining the relationship between human evaluations and stock price, apart from the apparent underlying values of companies uncovered by fundamental analysis.
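
One simple way to quantify a relationship between tweet sentiment and stock price is a correlation coefficient. The following is a minimal sketch, assuming a Pearson correlation over paired daily values; this summary does not state which correlation measure the study uses, and the function name `pearson` and the sample numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only; not data from the study.
# One entry per day: mean predicted tweet sentiment and price change (%).
sentiment = [0.10, 0.35, -0.20, 0.05, 0.40]
price_change = [0.4, 1.1, -0.9, 0.1, 1.3]

r = pearson(sentiment, price_change)
print(f"Pearson r = {r:.3f}")
```

A value of r near +1 would indicate that sentiment and price tend to move together, a value near -1 that they tend to move in opposite directions, and a value near 0 that there is little linear relationship.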

Related Work
Textual Representation
Sentiment Analysis on Twitter
Predicting the Behavior of a Population Based on Sentiment
Data and Methods
Twitter and Stock Price Data Collection
Correlation Analysis on Stock Price and Tweet Sentiment
Sentiment Classification Results
Price and Sentiment Correlation Analysis Results
