Abstract

Every day, a massive amount of information is reported as video, audio, or text through media such as television, radio, social media, and web blogs. As the number of unstructured documents on these media has grown, finding relevant information has become more difficult, so extracting relevant events from large amounts of unstructured text is essential. We propose an event extraction model that detects, classifies, and extracts various types of events, along with their arguments, from Amharic text documents. We first identify Amharic language-specific issues and then propose a Bidirectional Long Short-Term Memory (BiLSTM) network with Word2vec word embeddings to detect and classify Amharic events in unstructured documents. We used 9,050 Amharic documents for event detection and extraction. In addition to detecting and classifying events, the model extracts event arguments that carry additional information about an event, such as Time and Place. The experimental results show that the BiLSTM approach with Word2vec word embeddings is promising for Amharic event detection and event classification, reaching 94% and 89% accuracy, respectively.

Keywords: Natural language processing; Information extraction; Event extraction; Bidirectional long short-term memory; word2vec
