Abstract

In this paper we propose self-attention enhanced Recurrent Neural Networks for the task of sentence classification. The proposed framework is based on the vanilla Recurrent Neural Network and the Bi-directional Recurrent Neural Network architectures. These architectures have been implemented with two different recurrent cells, namely the Long Short-Term Memory and the Gated Recurrent Unit. We have used a multi-head self-attention mechanism to improve feature selection and thus preserve dependencies over longer sequence lengths in the recurrent architectures. Further, to ensure better context development, we have used Mikolov's pre-trained word2vec word vectors in both the static and non-static modes. To check the efficacy of the proposed framework, we have compared our models with the state-of-the-art methods of Yoon Kim on seven benchmark datasets. The proposed framework achieves state-of-the-art results on four of the seven datasets and a performance gain over the baseline model on five of the seven. Furthermore, to check the effectiveness of self-attention for sentence classification, we compare our self-attention based framework with the Bahdanau attention based implementation from our previous work.
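To make the described architecture concrete, the following is a minimal PyTorch sketch of a self-attention enhanced bi-directional recurrent classifier, assuming the general structure stated in the abstract: pre-trained word vectors, a bi-directional recurrent encoder, multi-head self-attention over the recurrent hidden states, and a classification layer. The layer sizes, head count, mean pooling step, and class names are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class SelfAttentiveBiLSTM(nn.Module):
    """Sketch of a self-attention enhanced bi-directional RNN classifier.
    Hyperparameters and the pooling step are assumptions for illustration."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150,
                 num_heads=4, num_classes=2, pretrained_embeddings=None,
                 freeze_embeddings=True):
        super().__init__()
        # Pre-trained word2vec vectors can be loaded here; "static" mode
        # corresponds to freezing the embeddings, "non-static" to fine-tuning.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        if pretrained_embeddings is not None:
            self.embedding.weight.data.copy_(pretrained_embeddings)
            self.embedding.weight.requires_grad = not freeze_embeddings
        # Bi-directional LSTM; an nn.GRU could be swapped in for the GRU variant.
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                           bidirectional=True)
        # Multi-head self-attention over the recurrent hidden states.
        self.attn = nn.MultiheadAttention(embed_dim=2 * hidden_dim,
                                          num_heads=num_heads,
                                          batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        h, _ = self.rnn(x)                   # (batch, seq_len, 2*hidden_dim)
        # Queries, keys, and values all come from the RNN outputs (self-attention).
        attended, _ = self.attn(h, h, h)     # (batch, seq_len, 2*hidden_dim)
        pooled = attended.mean(dim=1)        # mean pooling (assumption)
        return self.classifier(pooled)       # class logits


# Example usage on random token ids: a batch of 8 sentences of length 20.
model = SelfAttentiveBiLSTM(vocab_size=10000, num_classes=2)
logits = model(torch.randint(0, 10000, (8, 20)))
```

The vanilla (uni-directional) variant of the framework would follow the same pattern with `bidirectional=False` and an attention dimension of `hidden_dim`.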
