Abstract

Objective: This work develops a named entity recognition (NER) system for the Hadiyya language. NER, which is widely used in text summarization, machine translation, and information retrieval, categorizes the tokens of a given corpus into predefined named entity classes.

Method: We propose a method that combines a Bidirectional Long Short-Term Memory neural network with a Conditional Random Field (BiLSTM-CRF) to automatically recognize Hadiyya entities (location, time, person, geography, and other non-named entities) from an annotated Hadiyya corpus. The experiments were conducted to discover the most suitable features for the Hadiyya NER system. The data were collected from the Department of Hadiya Language & Literature (DHLL) at Wachemo University, Ethiopia, from Hadiyya TV, and from Hadiyya Media Network (HMN), yielding a newly annotated dataset of 5,148 instances. We used 70% of the data for training and 30% for testing.

Finding: After training and validating the BiLSTM-CRF model on the collected dataset, we obtained precision, recall, and F1-measure values of 95.49%, 94.93%, and 95.21%, respectively.

Novelty: We contribute a hybrid NER system for the Hadiyya language that obtains state-of-the-art results and is independent of other natural language processing tasks.

Keywords: Conditional Random Field; Hadiyya Language; Long Short-Term Memory; Hadiyya Media Network
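The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes the F1-measure is the usual harmonic mean of precision and recall, and that the 70/30 split is taken over the 5,148 instances with simple rounding; the abstract does not state how the split was rounded, so the split sizes are only approximate. The function names `f1_score` and `split_sizes` are illustrative and do not come from the paper.

```python
from typing import Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def split_sizes(n_instances: int, train_fraction: float) -> Tuple[int, int]:
    """Approximate train/test sizes; rounding the train size is an assumption."""
    n_train = round(n_instances * train_fraction)
    return n_train, n_instances - n_train


# Values reported in the abstract, in percent.
reported_precision = 95.49
reported_recall = 94.93

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")

# Approximate split of the 5,148 annotated instances (70% train, 30% test).
n_train, n_test = split_sizes(5148, 0.70)
print(f"train = {n_train}, test = {n_test}")
```

Under these assumptions, the computed F1 rounds to 95.21%, which agrees with the reported value, and the split comes to roughly 3,604 training and 1,544 test instances.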
