Abstract

Attention-based methods for deep neural networks have attracted increasing interest in recent years. Attention mechanisms allow a model to focus on the important parts of a sequence and, as a result, improve the performance of neural networks on a variety of tasks, including sentiment analysis, emotion recognition, machine translation and speech recognition. In this work, we study attention-based models built on recurrent neural networks (RNNs) and examine their performance in various sentiment analysis settings. Self-attention, global-attention and hierarchical-attention methods are examined under various deep neural models, training methods and hyperparameters. Although attention mechanisms are a powerful recent concept in deep learning, their exact effectiveness in sentiment analysis has yet to be thoroughly assessed. We perform a comparative analysis on a text sentiment classification task, comparing baseline models with and without attention in every experiment. The experimental study additionally examines the proposed models’ ability to recognize opinions and emotions in movie reviews. The results indicate that attention-based models substantially improve the performance of deep neural models, with accuracy gains of up to 3.5%.
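As a rough illustration of the idea described above (not the paper’s exact architecture), an attention mechanism over RNN hidden states can be sketched as a softmax-weighted pooling of the timestep vectors. The function name `attention_pool` and the learned scoring vector `scores_w` are hypothetical names introduced only for this sketch:

```python
import math

def attention_pool(hidden_states, scores_w):
    """Minimal dot-product attention pooling over RNN hidden states.

    hidden_states: list of per-timestep hidden vectors (lists of floats).
    scores_w: a (hypothetical) learned vector used to score each timestep.
    Returns the attention-weighted context vector and the attention weights.
    """
    # Score each timestep by its dot product with the scoring vector.
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, scores_w))
              for h in hidden_states]
    # Softmax over timesteps (numerically stable form).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [sum(a * h[d] for a, h in zip(alphas, hidden_states))
               for d in range(dim)]
    return context, alphas
```

In a full model the context vector would be fed to a classification layer, and `scores_w` would be trained jointly with the RNN; here it is fixed only to show how the weighting focuses on high-scoring timesteps.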

Highlights

  • In the past decade, the dramatic decrease in computation cost and the drastic increase in data availability led to the emergence of a new sub-field of machine learning, called deep learning, which achieves very high performance on large datasets

  • The results show that baseline methods such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks without attention mechanisms report the lowest performance on all three datasets

  • We presented various attention-based models, including global-attention, self-attention and hierarchical-attention models



Introduction

The dramatic decrease in computation cost and the drastic increase in data availability led to the emergence of a new sub-field of machine learning, called deep learning, which outperforms earlier approaches by achieving very high performance on large data. Deep learning utilizes deep artificial neural networks, such as CNNs, RNNs [1], LSTMs [2] and GRUs [3], which achieve remarkable performance in various domains, such as speech recognition [4,5,6], signal and EEG analysis [7], computer vision [8,9,10], emotion recognition [11,12,13], disease and cancer recognition [14], as well as text classification and sentiment analysis [15,16,17]. In text classification and sentiment analysis in particular, deep neural networks have demonstrated quite remarkable performance [18]. Despite their limited interpretability and their computational cost, deep neural networks can model complex nonlinear relationships.

