Abstract

Recently, deep learning techniques have outperformed feature-based and shallow learning techniques in open-domain question answering systems (OpenQAS). However, only a few works have adopted these techniques to build Arabic OpenQAS that can extract exact answers from large information sources (e.g., Wikipedia). In addition, no available Arabic OpenQAS integrates a module that identifies duplicate questions, which would accelerate response time and reduce computation cost. In this paper, we propose an Arabic OpenQAS (named DAQAS) based on deep learning methods. It consists of three components: (1) Dense Duplicate Question Detection, which returns answers to questions that have already been answered; (2) a Retriever based on BM25 and query expansion by neural text generation; and (3) a Reader that extracts exact answers given a question and the retrieved passages that probably contain the answer. All components of our system integrate deep learning models, especially transformer-based techniques, which have achieved state-of-the-art results in different NLP fields. We performed several experiments with publicly available question answering datasets to show the effectiveness of our system. DAQAS obtained promising results, achieving 21.77% Exact Match and a 54.71% F1 score when using only the top 5 retrieved passages.
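
The control flow implied by the three components can be sketched as follows. This is only an illustrative outline, not the DAQAS implementation: the function name `answer_question`, the in-memory `cache`, and the four callables (`find_duplicate`, `expand_query`, `retrieve`, `read`) are hypothetical placeholders for the duplicate detector, the neural query expansion, the BM25 retriever, and the Reader. Storing new answers in the cache is likewise an assumption about how answered questions become available to the duplicate detector.

```python
from typing import Callable, Dict, List, Optional


def answer_question(
    question: str,
    cache: Dict[str, str],
    find_duplicate: Callable[[str, List[str]], Optional[str]],
    expand_query: Callable[[str], str],
    retrieve: Callable[[str, int], List[str]],
    read: Callable[[str, List[str]], str],
    top_k: int = 5,
) -> str:
    """Route a question through the three stages named in the abstract.

    All callables are hypothetical stand-ins supplied by the caller.
    """
    # Stage 1: if a duplicate of this question was already answered,
    # return the stored answer and skip retrieval and reading.
    match = find_duplicate(question, list(cache))
    if match is not None:
        return cache[match]

    # Stage 2: expand the query, then fetch the top-k passages
    # (the abstract names BM25 as the retrieval method).
    passages = retrieve(expand_query(question), top_k)

    # Stage 3: extract an answer from the retrieved passages.
    answer = read(question, passages)

    # Assumption: remember the answer so later duplicates hit stage 1.
    cache[question] = answer
    return answer
```

The default `top_k=5` mirrors the top 5 retrieved passages setting reported above; a repeated question returns from the first stage without invoking the Retriever or the Reader.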
