Automatically Building Financial Sentiment Lexicons While Accounting for Negation

Thomas Bos, Flavius Frasincar

doi:10.1007/s12559-021-09833-w

Open Access

Journal: Cognitive Computation
Publication Date: Feb 11, 2021
Citations: 8
License type: open-access

Affiliation: Erasmus University Rotterdam

Abstract

Financial investors make trades based on available information. Previous research has proved that microblogs are a useful source for supporting stock market decisions. However, the financial domain lacks specific sentiment lexicons that could be utilized to extract the sentiment from these microblogs. In this research, we investigate automatic approaches that can be used to build financial sentiment lexicons. We introduce weighted versions of the Pointwise Mutual Information approaches to build sentiment lexicons automatically. Furthermore, existing sentiment lexicons often neglect negation while building the sentiment lexicons. In this research, we also propose two methods (Negated Word and Flip Sentiment) to extend the sentiment building approaches to take into account negation when constructing a sentiment lexicon. We build the financial sentiment lexicons by leveraging 200,000 messages from StockTwits. We evaluate the constructed financial sentiment lexicons in two different sentiment classification tasks (unsupervised and supervised). In addition, the created financial sentiment lexicons are compared with each other and with other existing sentiment lexicons. The best performing financial sentiment lexicon is built by combining our Weighted Normalized Pointwise Mutual Information approach with the Negated Word approach. It outperforms all the other sentiment lexicons in the two sentiment classification tasks. In the unsupervised sentiment classification task, it has, on average, a balanced accuracy of 69.4%, and in the supervised setting, a balanced accuracy of 75.1%. Moreover, the various sentiment classification tasks confirm that the sentiment lexicons could be improved by taking into account negation while building the sentiment lexicons. The improvement could be made by using one of the proposed methods to incorporate negation in the sentiment lexicon construction process.
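
As an illustration of the kind of approach the abstract describes, the following is a minimal sketch of a plain (unweighted) PMI-based lexicon builder with a simple negation-marking step. It is not the authors' implementation: the weighted and normalized variants are not defined in this summary, so only the standard score PMI(w, positive) - PMI(w, negative) is shown. The negation cue list, the `NOT_` prefix scheme, the one-token negation scope, the add-one smoothing, and the `positive`/`negative` label names are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical negation cues; the paper's actual cue list is not given here.
NEGATORS = {"not", "no", "never"}


def mark_negation(tokens):
    """Drop each negation cue and prefix the next token with 'NOT_'.

    Assumed scheme: a negated word becomes its own lexicon entry,
    and the negation scope is a single token.
    """
    marked, negate = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        marked.append("NOT_" + tok if negate else tok)
        negate = False
    return marked


def pmi_lexicon(messages, smoothing=1.0):
    """Score each word as PMI(w, positive) - PMI(w, negative).

    messages: iterable of (tokens, label), label in {"positive", "negative"}.
    Since PMI(w, c) = log2(p(w | c) / p(w)), the p(w) terms cancel and the
    score reduces to log2(p(w | positive) / p(w | negative)).
    """
    counts = {"positive": Counter(), "negative": Counter()}
    for tokens, label in messages:
        counts[label].update(mark_negation(tokens))

    vocab = set(counts["positive"]) | set(counts["negative"])
    totals = {c: sum(cnt.values()) for c, cnt in counts.items()}
    denom_pos = totals["positive"] + smoothing * len(vocab)
    denom_neg = totals["negative"] + smoothing * len(vocab)

    lexicon = {}
    for w in vocab:
        # Additive smoothing keeps the ratio finite for one-sided words.
        p_pos = (counts["positive"][w] + smoothing) / denom_pos
        p_neg = (counts["negative"][w] + smoothing) / denom_neg
        lexicon[w] = math.log2(p_pos / p_neg)
    return lexicon


# Toy messages, invented for illustration only.
toy_messages = [
    (["buy", "good"], "positive"),
    (["not", "good"], "negative"),
    (["sell", "bad"], "negative"),
    (["not", "bad"], "positive"),
]
toy_lexicon = pmi_lexicon(toy_messages)
```

In this toy corpus, `good` receives a positive score while `NOT_good` receives a negative one, which is the effect that treating negated words as separate entries is meant to capture. Without the marking step, `good` would occur equally often in both classes and score zero.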

Highlights

The introduction of the Web made it possible and easy for users to express their opinions online.
Thereafter, we look at the performance of the obtained financial sentiment lexicons in various sentiment classification tasks.
In the Negated Word (NW) category, all evaluation metrics improve for the Bayes' Theorem Benchmark (BTB), Weighted Pointwise Mutual Information (W-PMI), Weighted Normalized PMI (W-NPMI), and Inverse Category Frequency-based (ICF) sentiment lexicons.

Summary

Introduction

The introduction of the Web made it easy for users to express their opinions online. Sentiment lexicons play a key role in sentiment analysis approaches, since most of them use a sentiment lexicon [4]. These lexicons can be constructed manually or automatically. Creating them manually ensures high quality because they are made by language and domain experts. However, this process is time-consuming, and those experts may not always be available. The coverage of manually built sentiment lexicons is also low. These disadvantages turned the focus to automatically building sentiment lexicons. Building a domain-specific sentiment lexicon is more challenging, since words can have domain-specific meanings and sentiments.

Methods

Results

Conclusion