Abstract
Sentiment analysis (SA) detects people’s opinions from text using natural language processing (NLP) techniques. Recent research has shown that deep learning models, i.e., Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based models, provide promising results for recognizing sentiment. Nonetheless, although a CNN can extract high-level features using convolutional and max-pooling layers, it cannot efficiently learn sequential correlations. Bidirectional RNNs, in turn, use two RNN directions to better extract long-term dependencies but cannot extract local features in parallel, while Transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT) require substantial computational resources to fine-tune and face overfitting problems on small datasets. This paper proposes a novel attention-based model that combines CNNs with LSTMs (named ACL-SA). First, it applies a preprocessor to enhance data quality and employs term frequency-inverse document frequency (TF-IDF) feature weighting and pre-trained GloVe word embeddings to extract meaningful information from textual data. In addition, it utilizes the CNN’s max-pooling to extract contextual features and reduce feature dimensionality. Moreover, it uses an integrated bidirectional LSTM to capture long-term dependencies. Furthermore, it applies an attention mechanism at the CNN’s output layer to emphasize each word’s attention level. To avoid overfitting, GaussianNoise and GaussianDropout are adopted as regularization. The model’s robustness is evaluated on four standard English datasets, i.e., Sentiment140, US-airline, Sentiment140-MV, and SA4A, with various performance metrics, and its efficiency is compared with existing baseline models and approaches. The experimental results show that the proposed method significantly outperforms state-of-the-art models.
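To make the attention step concrete, the following is a minimal pure-Python sketch (not the paper’s code) of attention weighting over a sequence of feature vectors, assuming a simple dot-product scoring function against a query vector: scores are softmax-normalized into per-timestep attention weights, which then produce a single weighted context vector.

```python
import math

def attention_pool(features, query):
    """Score each timestep's feature vector against a query vector,
    softmax-normalize the scores, and return (weights, context vector)."""
    # Dot-product score per timestep (a hypothetical scoring function;
    # the paper's exact scoring layer may differ)
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    # Numerically stable softmax over timesteps
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the feature vectors gives one context vector
    dim = len(features[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, features))
               for i in range(dim)]
    return weights, context

# Three timesteps of 2-dimensional features; the query favors dimension 0
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
w, ctx = attention_pool(feats, query)
```

The weights sum to one, so the context vector stays in the convex hull of the timestep features; timesteps whose features align with the query receive proportionally more weight.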
Highlights
Nowadays, people express their feelings and opinions to exchange their views using social media, such as Twitter, Facebook, Weibo, LinkedIn, and WeChat
We found that our proposed model achieves a significant accuracy of 94.01% on this dataset, with Recurrent Neural Network (RNN)-term frequency-inverse document frequency (TF-IDF) [34] remaining the lowest among all compared models
This work presents a novel ACL-sentiment analysis (SA) model to tackle the lack of semantic information, high dimensionality, and overfitting problems
Summary
People express their feelings and opinions and exchange their views using social media, such as Twitter, Facebook, Weibo, LinkedIn, and WeChat. A Bi-LSTM is employed to extract contextual information from the CNN-generated feature layers to perform sentiment analysis. Since sentiment analysis is one of the valuable decision-making methods, most work in sentiment classification has been done using data mining, machine learning algorithms, and knowledge-based approaches [12]. Lexicon-based sentiment analysis is performed, and a product identification model is built to detect comparative social media content; the authors presented the essential advantages of the target product compared to its competitors. These approaches are usually less accurate [17,18]. Another major challenge of machine-learning-based, lexicon-based, and hybrid sentiment analysis is feature selection, which is typically domain-dependent. DL is known for its multiple levels of representation learning in machine learning and has recently been applied to sentiment analysis with significant results [20]
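The TF-IDF feature weighting used in the proposed pipeline can be illustrated with a short pure-Python sketch (a standard formulation, not necessarily the exact variant the paper uses): each term’s frequency within a document is scaled by the log-inverse of its document frequency across the corpus, so terms common to every document are down-weighted.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return per-document TF-IDF weights for a list of tokenized docs.
    tf = term count / doc length; idf = log(n_docs / docs containing term)."""
    n = len(docs)
    df = Counter()                      # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({t: (tf[t] / len(doc)) * math.log(n / df[t])
                         for t in tf})
    return weighted

# A toy corpus of tokenized tweets (illustrative data only)
corpus = [["good", "flight"], ["bad", "flight"], ["good", "service"]]
w = tfidf(corpus)
```

Here "flight" appears in two of three documents, so it gets a lower weight than "service", which appears in only one; a term present in every document would receive weight zero.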