Abstract

Sentiment analysis of textual data using natural language processing (NLP) has wide-ranging applications across many domains and industries. We methodically studied existing models and propose new models for the sentiment analysis task. The proposed models were evaluated with three widely used word embeddings, both separately and in a combined approach that uses all embeddings as distinct channels. We combined deep neural network models such as Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN) so that the integrated models complement each other with their distinct architectures. The choice of word embedding had a profound impact on model accuracy. Word2Vec was the best embedding, giving the highest accuracy in almost all implemented models, followed by GloVe. FastText performed consistently worse, giving much lower accuracy than the other embeddings. We also observed that adding transformer encoder layers to the CNN improves accuracy by 2% compared to a CNN without any transformer layers. Using a transformer encoder layer in conjunction with both BiLSTM and CNN also improved accuracy by 2-3% over the CNN-BiLSTM model. The proposed model achieved an accuracy of 89.04% on the SST-2 dataset. We also compared a larger pretrained language model used for sentiment analysis with our proposed approach. The accuracy values obtained for each combination of embeddings and models can help other researchers select word embeddings for their models.
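
The idea of feeding several embeddings as separate channels can be illustrated with a minimal sketch. It is not the implementation used in this work: the three lookup tables, their two-dimensional toy vectors, and the helper names `lookup` and `stack_channels` are all hypothetical, and out-of-vocabulary tokens are assumed to map to a zero vector. The result is a nested list shaped (channels, tokens, dimensions), analogous to the colour channels of an image fed to a CNN.

```python
from typing import Dict, List

Vector = List[float]
EmbeddingTable = Dict[str, Vector]


def lookup(table: EmbeddingTable, tokens: List[str], dim: int) -> List[Vector]:
    """Map each token to its vector; unknown tokens get a zero vector (assumption)."""
    zero = [0.0] * dim
    return [list(table.get(tok, zero)) for tok in tokens]


def stack_channels(tables: List[EmbeddingTable], tokens: List[str], dim: int) -> List[List[Vector]]:
    """Build one channel per embedding table: shape (channels, tokens, dim)."""
    return [lookup(table, tokens, dim) for table in tables]


# Toy stand-ins for Word2Vec, GloVe and FastText tables (made-up values).
word2vec_like = {"good": [0.9, 0.1], "movie": [0.2, 0.7]}
glove_like = {"good": [0.8, 0.3], "movie": [0.1, 0.6]}
fasttext_like = {"good": [0.5, 0.5], "movie": [0.4, 0.4], "overall": [0.3, 0.2]}

tokens = ["a", "good", "movie", "overall"]
channels = stack_channels([word2vec_like, glove_like, fasttext_like], tokens, dim=2)

shape = (len(channels), len(channels[0]), len(channels[0][0]))
print(shape)  # (3, 4, 2)
```

In a real model, all embeddings would need the same dimensionality (or a projection to a common one) for the channels to stack this way.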
