We develop EmTract, an open-source tool for extracting investor emotions from financial social media data. We contribute a novel dataset of 10,000 financial social media messages annotated with emotion labels, and we augment the DistilBERT model with 4,861 additional tokens, including emojis and emoticons. This augmentation improves accuracy by more than 3 percentage points over the standard BERT model while providing faster inference and lower computational requirements. Our models and datasets are publicly available to promote broader adoption and further research in financial sentiment analysis. We validate EmTract on the 2021 “meme stock” rally, where it accurately captures spikes in emotions such as anger and disgust following trading restrictions, demonstrating its applicability to real-world events. Heterogeneity tests further show that emotions have a stronger impact on smaller, more volatile, and heavily shorted stocks, consistent with established behavioral finance theories. These findings underscore the importance of integrating emotional dynamics into market analysis, particularly for speculative assets. Our contributions advance the understanding of the psychological drivers of financial markets and offer practical tools for future research and industry applications.