Abstract

There is a need to extract meaningful information from big data, classify it into different categories, and predict end-user behavior or emotions. Large amounts of data are generated from various sources such as social media and websites. Text classification is a representative research topic in the field of natural-language processing (NLP) that categorizes unstructured text data into meaningful categorical classes. The long short-term memory (LSTM) model and the convolutional neural network (CNN) for sentence classification produce accurate results and have recently been used in various NLP tasks. CNN models use convolutional layers and max-pooling or max-over-time pooling layers to extract higher-level features, while LSTM models can capture long-term dependencies between word sequences and are therefore well suited to text classification. However, even with a hybrid approach that leverages the strengths of these two deep-learning models, the number of features to remember for classification remains huge, which hinders the training process. In this study, we propose an attention-based Bi-LSTM+CNN hybrid model that capitalizes on the advantages of LSTM and CNN with an additional attention mechanism. We trained the model on Internet Movie Database (IMDB) movie review data to evaluate its performance, and the test results showed that the proposed hybrid attention Bi-LSTM+CNN model produces more accurate classification results, as well as higher recall and F1 scores, than individual multi-layer perceptron (MLP), CNN, or LSTM models and than existing hybrid models.
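The attention mechanism described above weighs the Bi-LSTM hidden state at each time step and sums them into a single context vector for the classifier, so the model need not remember every feature equally. A minimal numpy sketch of this attention pooling follows; the scoring form (tanh of the hidden states projected by a learned vector, then softmax) and all names here are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(H, w):
    """Collapse per-time-step hidden states into one context vector.

    H : (T, d) matrix of Bi-LSTM hidden states, one row per time step.
    w : (d,) learned scoring vector (randomly initialized here).
    Returns the (d,) attention-weighted context vector.
    """
    scores = softmax(np.tanh(H) @ w)  # (T,) attention weights, sum to 1
    return scores @ H                 # weighted sum over time steps

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))  # 5 time steps, 8 hidden units
w = rng.standard_normal(8)
c = attention_pool(H, w)         # context vector passed to the classifier
```

In a trained model, `w` would be learned jointly with the Bi-LSTM and CNN weights, so the softmax concentrates on the time steps most informative for the sentiment label.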

Highlights

  • There is an unprecedented deluge of text data due to increased internet use, resulting in generation of text data from various sources such as social media and websites

  • The bidirectional long short-term memory (Bi-LSTM) neural network is composed of LSTM units that operate in both directions to incorporate past and future context information

  • We leverage the unique advantages of the LSTM and the convolutional neural network (CNN), and we propose a hybrid model that uses a CNN for feature extraction together with a Bi-LSTM and an attention mechanism



Introduction

There is an unprecedented deluge of text data due to increased internet use, with text data generated from various sources such as social media and websites. Text data are unstructured and contain natural-language constructs, making it difficult to infer an intended message from the data. This has led to increased research into the use of deep learning for natural-language-based sentiment classification and natural-language inference. LSTM is an improved recurrent neural network (RNN) architecture that uses a gating mechanism consisting of an input gate, forget gate, and output gate [4]. A second set of results was obtained with variable data sizes, and the proposed model was superior in terms of accuracy and F1 score.
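The gating mechanism mentioned above can be sketched in a few lines of numpy: at each time step, the input gate decides how much new information to write into the cell state, the forget gate how much of the old cell state to keep, and the output gate how much of the cell state to expose as the hidden state. This is a minimal illustrative single-step implementation of the standard LSTM equations, with an assumed gate packing order (i, f, o, g), not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step with input (i), forget (f), and output (o) gates.

    x : (n,) input vector; h_prev, c_prev : (d,) previous hidden/cell state.
    W : (4d, n), U : (4d, d), b : (4d,) stacked gate parameters,
    packed in the (assumed) order i, f, o, g.
    """
    z = W @ x + U @ h_prev + b
    d = h_prev.shape[0]
    i = sigmoid(z[:d])        # input gate: how much new info to write
    f = sigmoid(z[d:2*d])     # forget gate: how much old cell state to keep
    o = sigmoid(z[2*d:3*d])   # output gate: how much cell state to expose
    g = np.tanh(z[3*d:])      # candidate cell update
    c = f * c_prev + i * g    # new cell state
    h = o * np.tanh(c)        # new hidden state
    return h, c

rng = np.random.default_rng(1)
n, d = 4, 3
h, c = lstm_step(rng.standard_normal(n), np.zeros(d), np.zeros(d),
                 rng.standard_normal((4 * d, n)),
                 rng.standard_normal((4 * d, d)),
                 np.zeros(4 * d))
```

A Bi-LSTM simply runs two such recurrences, one over the sequence forward and one backward, and concatenates the two hidden states at each time step.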

Word2vec
Bi-LSTM
Structure
Attention Mechanism
Related Research
The Proposed Model
Sequence Embedding Layer
Bi-LSTM Attention Layer
Experiment
Dataset
Performance Evaluation Metrics
Results
Analysis
Future Works
Conclusions