Abstract

Social media plays an important role in facilitating conversations and news dissemination. Twitter in particular has recently been used by investors to discuss stock exchange-listed companies. Investors depend on timely, credible information to make well-informed investment decisions, where credibility is defined as the believability of information. Much work has been done on assessing credibility on Twitter in domains such as politics and natural disaster events, but work on assessing the credibility of financial statements is scant within the literature. Investments made on apocryphal information could undermine social media’s aim of providing a transparent arena for sharing news and encouraging discussion of stock market events. This paper presents a novel methodology to assess the credibility of financial stock market tweets, which is evaluated by conducting an experiment using tweets pertaining to companies listed on the London Stock Exchange. Three sets of traditional machine learning classifiers (using three different feature sets) are trained using an annotated dataset. We highlight the importance of considering features specific to the domain in which credibility is being assessed; in the case of this paper, financial features. In total, after discarding non-informative features, 34 general features are combined with over 15 novel financial features for training classifiers. Results show that classifiers trained on both general and financial features can outperform classifiers trained on general features alone, with Random Forest being the top performer, although the Random Forest model requires more features (37) than other classifiers to achieve such performance (K-Nearest Neighbours, for example, requires 9).

Highlights

  • Investments made on stock markets depend on timely and credible information being made available to investors

  • We propose a rich set of features divided into two groups – general features found in all tweets, regardless of subject matter, and financial features, which are engineered to assess the credibility of financial stock tweets

  • We present the results (Table 8) obtained from the experiment based on all of the features that remain after the non-informative features are removed: 34 general features (GF) and 21 financial features (FF). We illustrate that some models’ performance suffers if feature selection techniques are not applied
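
The split between general and financial features described in the highlights above can be illustrated with a minimal sketch. The feature names and regular expressions below are hypothetical and chosen only for illustration; they are not the paper's actual feature definitions, which are not listed in this summary. The sketch assumes that a financial feature can be something as simple as counting cashtags (ticker symbols prefixed with `$`), while general features apply to any tweet regardless of subject matter.

```python
import re

# Hypothetical patterns for illustration only; the paper's real features are not given here.
CASHTAG = re.compile(r"\$[A-Za-z]{1,6}\b")   # e.g. $VOD
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")


def general_features(text: str) -> dict:
    """Features that can be computed for any tweet, whatever its topic."""
    return {
        "char_count": len(text),
        "word_count": len(text.split()),
        "hashtag_count": len(HASHTAG.findall(text)),
        "url_count": len(URL.findall(text)),
        "mention_count": len(MENTION.findall(text)),
    }


def financial_features(text: str) -> dict:
    """Assumed domain-specific features based on cashtags."""
    cashtags = CASHTAG.findall(text)
    return {
        "cashtag_count": len(cashtags),
        # Treat $VOD and $vod as the same ticker.
        "distinct_cashtags": len({c.upper() for c in cashtags}),
    }


def combined_features(text: str) -> dict:
    """General and financial features merged into one feature dictionary."""
    return {**general_features(text), **financial_features(text)}


tweet = "$VOD results beat forecasts, see https://example.com #LSE $vod @trader"
features = combined_features(tweet)
```

A classifier trained on `general_features` alone and one trained on `combined_features` could then be compared on the same annotated tweets, which mirrors the comparison the paper reports.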

Summary

Introduction

Investments made on stock markets depend on timely and credible information being made available to investors. Recent research has found that Twitter is becoming a hotbed for rumour propagation (Maddock et al., 2015). Although such rumours and speculation on Twitter can be informative, as they can reflect investor mood and outlook (Ceccarelli et al., 2016), this new age of financial media, in which discussions take place on social media, demands mechanisms to assess the credibility of such posts. Although there has been no research on the credibility of financial stock-related tweets, work does exist on the credibility of tweets in areas such as politics (Sujoy Sikdar, Kang, O’donovan, Hollerer, & Adal, 2013; Page & Duffy, 2018), health (Bhattacharya et al., 2012), and natural disaster events (Yang et al., 2019; Thomson et al., 2012). This section will explore the related work on supervised learning approaches for determining credibility, because supervised learning is more popular than unsupervised approaches.
