Deep neural networks for Arabic information extraction

Abdelhalim Saadi, Hacene Belhadef

doi:10.1108/sasbe-03-2019-0031

Abstract

Purpose: This paper presents a system based on deep neural networks for extracting particular entities from natural language text. A massive amount of textual information is now available electronically, and this volume makes it difficult to find or extract the relevant information.

Design/methodology/approach: This study presents an original system that extracts Arabic named entities by combining a deep neural network-based part-of-speech tagger with a neural network-based named entity extractor. First, the system extracts the grammatical class of each word with high precision, depending on the word's context; this module acts as a disambiguation step. A second module then extracts the named entities.

Findings: Using deep neural networks in natural language processing requires tuning many hyperparameters, which is a time-consuming process. To deal with this problem, statistical methods such as the Taguchi method are highly recommended. In this study, the system is successfully applied to Arabic named entity recognition, with a reported accuracy of 96.81 per cent, which is better than the state-of-the-art results.

Research limitations/implications: The system is designed and trained for the Arabic language, but the architecture can be used for other languages.

Practical implications: Information extraction systems are developed for different applications, such as analysing newspaper articles and databases for commercial, political and social objectives. An information extraction system can also be built over an information retrieval (IR) system, which eliminates irrelevant documents and paragraphs.

Originality/value: The proposed system can be regarded as the first attempt to use double deep neural networks to increase accuracy. It can also be built over an IR system, which eliminates irrelevant documents and paragraphs. This filtering reduces the large number of documents from which relevant information must then be extracted by the information extraction system.
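
The Findings mention the Taguchi method as a way to cut the cost of hyperparameter tuning. The sketch below illustrates the general idea only: it uses the standard L4 orthogonal array to test three two-level factors in four runs instead of the eight a full grid would need, then picks the best level of each factor from its average score. The factor names, their levels, and the `toy_score` function are hypothetical stand-ins and are not taken from the paper; in practice `toy_score` would be replaced by training the model and measuring validation accuracy.

```python
from statistics import mean

# Standard L4 orthogonal array: 3 two-level factors covered in 4 runs.
# Every pair of columns contains each level combination exactly once.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Hypothetical hyperparameters and levels (assumptions, not from the paper).
FACTORS = {
    "hidden_units": [64, 128],
    "learning_rate": [0.01, 0.001],
    "dropout": [0.2, 0.5],
}


def taguchi_trials(factors, array):
    """Map each row of the orthogonal array to a concrete configuration."""
    names = list(factors)
    return [
        {name: factors[name][level] for name, level in zip(names, row)}
        for row in array
    ]


def main_effects(factors, array, scores):
    """Average score observed for each level of each factor."""
    effects = {}
    for col, name in enumerate(factors):
        effects[name] = {
            value: mean(s for row, s in zip(array, scores) if row[col] == level)
            for level, value in enumerate(factors[name])
        }
    return effects


def best_setting(effects):
    """Pick, for every factor, the level with the highest average score."""
    return {name: max(levels, key=levels.get) for name, levels in effects.items()}


def toy_score(cfg):
    """Illustrative stand-in for 'train the model, return validation accuracy'."""
    score = 0.90
    score += 0.02 if cfg["hidden_units"] == 128 else 0.0
    score += 0.03 if cfg["learning_rate"] == 0.01 else 0.0
    score += 0.01 if cfg["dropout"] == 0.2 else 0.0
    return score


trials = taguchi_trials(FACTORS, L4)
scores = [toy_score(cfg) for cfg in trials]
best = best_setting(main_effects(FACTORS, L4, scores))
print(best)
```

The selected configuration need not be one of the four tested runs: the main-effects analysis combines the best level of each factor independently, which is reasonable when the factors interact only weakly.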

Journal: Smart and Sustainable Built Environment
Publication Date: Apr 3, 2020
Citations: 5

