Abstract

Sequence labeling models based on recurrent neural network variants, such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks, show promising performance on several natural language processing (NLP) problems, including named-entity recognition (NER). Most existing models rely on word embeddings to capture similarities between words; however, they fall short when handling previously unobserved or infrequent words. Moreover, attention mechanisms have been used to improve sequence labeling tasks. In this paper, we propose an efficient multi-attention-layer system for the Arabic named-entity recognition (ANER) task. In addition to word-level embeddings, we adopt character-level embeddings and combine the two via an embedding-level attention mechanism. The output is fed into a bidirectional-LSTM encoder, followed by a self-attention layer that further boosts performance. Our model achieves an F1 score of approximately 91% on the “ANERCorpus.” The experimental results demonstrate that our method outperforms other systems, and our multi-layer attention approach yields a new state-of-the-art result for ANER.
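
The following is a minimal PyTorch sketch of the kind of architecture the abstract describes: word- and character-level embeddings fused by an embedding-level attention gate, a bidirectional-LSTM encoder, and a self-attention layer before tag classification. All layer sizes, the character-BiLSTM choice, the gated fusion form of the embedding-level attention, and the multi-head self-attention are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class MultiAttentionNER(nn.Module):
    """Illustrative sketch: char/word embedding fusion + BiLSTM + self-attention."""

    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=100, char_dim=30, hidden_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # Character-level encoder: BiLSTM over the characters of each word
        # (an assumption; a char-CNN would also fit the abstract's description).
        self.char_lstm = nn.LSTM(char_dim, word_dim // 2,
                                 batch_first=True, bidirectional=True)
        # Embedding-level attention: a learned gate weighting the word-level
        # vs. character-level representation of each token.
        self.fuse_attn = nn.Linear(2 * word_dim, word_dim)
        # Sentence encoder: bidirectional LSTM.
        self.encoder = nn.LSTM(word_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Self-attention layer over the encoder outputs.
        self.self_attn = nn.MultiheadAttention(2 * hidden_dim, num_heads=4,
                                               batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, word_len)
        b, s, w = char_ids.shape
        w_vec = self.word_emb(word_ids)                    # (b, s, word_dim)
        c_emb = self.char_emb(char_ids.view(b * s, w))     # (b*s, w, char_dim)
        _, (h_n, _) = self.char_lstm(c_emb)                # final states per direction
        c_vec = torch.cat([h_n[0], h_n[1]], dim=-1).view(b, s, -1)
        # Embedding-level attention: sigmoid gate mixing the two views.
        gate = torch.sigmoid(self.fuse_attn(torch.cat([w_vec, c_vec], dim=-1)))
        fused = gate * w_vec + (1 - gate) * c_vec
        enc, _ = self.encoder(fused)                       # (b, s, 2*hidden_dim)
        attn_out, _ = self.self_attn(enc, enc, enc)        # self-attention layer
        return self.classifier(attn_out)                   # per-token tag scores
```

In practice such a model would be trained with a token-level cross-entropy or CRF objective over BIO-style NER tags; that training loop is omitted here.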
