Sentiment analysis is an important research area in natural language processing, with applications across many social contexts. Recent work has brought notable improvements in accuracy, yet existing text sentiment analysis models still lack interpretability. This paper proposes a text sentiment classification method that integrates BERT and BiLSTM and comprises three stages: training, signature generation, and testing. Our primary contribution lies in the signature generation and testing stages. In the training stage, BERT generates word embeddings and a BiLSTM is trained on them to build the text sentiment classification model. In the signature generation stage, the training and validation sets are passed through this classification model, the feature vectors are extracted after the fully-connected layer, and these feature vectors are used to generate signatures, which are then stored in a database. In the testing stage, the same procedure is applied to the test set to obtain its signatures, which are matched against the signature database to produce the final classification results. A threshold on the matching degree is then applied when evaluating the results. Because each prediction is obtained by matching against stored signatures, the method enhances the interpretability of the classification task. Experiments were conducted on two datasets, SST-2 and IMDB, using four quantitative metrics: accuracy, recall, precision, and F1-score. With a threshold of 0.99, the model's best accuracy, recall, precision, and F1-score reached 0.896, 1, 0.896, and 0.945 on SST-2 and 0.847, 0.999, 0.848, and 0.917 on IMDB, respectively.
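The matching step in the testing stage can be illustrated with a minimal sketch. The abstract does not specify how a signature is derived from a feature vector or how the matching degree is computed, so the sketch assumes, purely for illustration, that a signature is the feature vector itself and that the matching degree is cosine similarity. The names `Signature`, `cosine_similarity`, and `match_signature` are hypothetical; only the 0.99 threshold is taken from the text.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Tuple


@dataclass
class Signature:
    """Hypothetical database entry: a stored feature vector and its label."""
    vector: List[float]
    label: str


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Assumed matching degree; the paper's actual measure may differ."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_signature(
    query: List[float],
    database: List[Signature],
    threshold: float = 0.99,  # threshold value reported in the abstract
) -> Tuple[Optional[str], float]:
    """Return the label of the best-matching stored signature and its degree.

    The label is None when the best degree falls below the threshold.
    """
    if not database:
        return None, 0.0
    best = max(database, key=lambda s: cosine_similarity(query, s.vector))
    degree = cosine_similarity(query, best.vector)
    return (best.label if degree >= threshold else None), degree


# Toy two-dimensional signatures, invented for this example.
database = [
    Signature([0.9, 0.1], "positive"),
    Signature([0.1, 0.9], "negative"),
]
label, degree = match_signature([0.88, 0.12], database)
print(label, round(degree, 4))
```

Under these assumptions, each prediction can be traced to the specific stored signature that produced it, which is one way to read the interpretability claim above.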