The integration of Deep Neural Networks (DNNs) with Natural Language Processing (NLP) has opened new avenues in financial market prediction, particularly through the use of textual information. This study makes two primary contributions to stock trend prediction: (i) it exploits textual data (news, comments, microblogs) with advanced DNN architectures, making fuller use of market information; and (ii) it significantly improves the accuracy of predicting the direction of stock volatility by combining textual data with neural network techniques. To support this, we crawled, filtered, and constructed a large-scale microblog dataset containing approximately 114,992 microblog posts for 40 companies listed on China's Science and Technology Innovation Board (STIB) during 2021. We conducted a comprehensive analysis using several DNN techniques, including Feedback Neural Networks (FNN), Supervised Contrastive Learning (SCL), Cross Entropy (CE), and Dual Contrastive Learning (DualCL), in conjunction with bag-of-words models and BERT and RoBERTa encoders. Our findings reveal that the SCL method, when combined with microblog data, significantly increases prediction accuracy, particularly during the COVID-19 period. Furthermore, we found that using a cross-stock dataset enhances the accuracy of all prediction methods, and that random allocation of microblog data leads to better results than sequential allocation. We also compared traditional models, such as the CAPM and the three-factor and five-factor models, against neural network-based methods; the results suggest that the SCL method is notably superior in prediction accuracy. Finally, we applied our findings to real-world trading strategies and demonstrated the practical advantages of the SCL method, evidenced by significant improvements across all performance indicators.