Sentiment Analysis (SA) is one of the most active research areas in Natural Language Processing (NLP) because of its potential value for business and society. With the development of language representation models, fine-tuning pre-trained language models has proven effective across many NLP downstream tasks. For Vietnamese, several pre-trained language models are now available, both monolingual and multilingual. However, these models differ in architecture, pre-training data, and pre-processing steps, so fine-tuning them can be expected to yield different levels of effectiveness. Moreover, no study to date has evaluated these models on the same datasets for the SA task. This article presents a fine-tuning approach to investigate the performance of different pre-trained language models on the Vietnamese SA task. The experimental results show that the monolingual PhoBERT and ViT5 models outperform previous studies and achieve new state-of-the-art results on five benchmark Vietnamese SA datasets. To the best of our knowledge, this study is the first to investigate the performance of fine-tuned Transformer-based models on five Vietnamese SA datasets of different domains and sizes.
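For illustration only, a minimal fine-tuning sketch in the spirit of the approach described above might use the Hugging Face transformers library with the publicly released PhoBERT checkpoint (vinai/phobert-base). This is not the authors' exact configuration: the toy dataset, label set, and hyperparameters below are assumptions, and the word segmentation that PhoBERT normally expects is omitted for brevity.

```python
# Illustrative sketch: fine-tuning a Vietnamese pre-trained model (PhoBERT)
# for sentence-level sentiment classification. Dataset, labels, and
# hyperparameters are placeholders, not the paper's actual setup.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from datasets import Dataset

# Toy data standing in for a Vietnamese SA benchmark (0 = negative, 1 = positive).
# Note: PhoBERT's original recipe expects word-segmented input; that step is skipped here.
train_ds = Dataset.from_dict({
    "text": ["sản phẩm rất tốt", "dịch vụ quá tệ"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/phobert-base", num_labels=2)

def tokenize(batch):
    # Tokenize and pad/truncate to a fixed length (an assumed value).
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="phobert-sa",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         learning_rate=2e-5)
trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()
```

Swapping the checkpoint name (e.g. to a multilingual model or VietAI/vit5-base with a suitable head) is the main change needed to compare different pre-trained models under the same fine-tuning setup.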