Stock-related news and articles on Vietnamese economic websites and blogs are increasing rapidly, but they are mixed with entertainment news, miscellaneous topics, and advertisements. This frustrates investors and analysts, who want only to find and analyze the stock-related information that matters (Boudoukh, 2013). This research introduces a novel method for automatically labeling the relevance of news articles to the stock market, based on a set of criteria derived from financial domain knowledge. It also develops a deep learning classifier that builds on the BERT architecture and a Vietnamese language model (viBERT) (Tran, 2020), with the aim of scoring stock market news accurately and efficiently. This approach helps investors and analysts filter out irrelevant content on Vietnamese economic websites and access the information most useful for their analysis of stock movements.