Bengali Information Retrieval System (BIRS)

Md Kowsher, Skshohorab Ahmed, Imran Hossen

doi:10.5121/ijnlc.2019.8501

Open Access

Abstract

An Information Retrieval System helps a user find relevant information by means of Natural Language Processing (NLP). In this paper, we present an algorithmic Bengali Information Retrieval System (BIRS) that operates on a supplied body of information and is mathematically and statistically grounded. We describe two algorithms for lemmatizing Bengali words, Trie and Dictionary Based Search by Removing Affix (DBSRA), and compare them with Edit Distance for exact lemmatization. We present a Bengali anaphora resolution system that uses Hobbs' algorithm to obtain the correct expression of the information. For question answering, TF-IDF and Cosine Similarity are used to find the accurate answer from the documents. We also introduce a Bengali Language Toolkit (BLTK) and Bengali Language Expression (BRE), which simplify the implementation of our task. In addition, we have developed a Bengali root-word corpus, a synonym corpus and a stop-word corpus, and gathered 672 articles from the popular Bengali newspaper 'The Daily Prothom Alo' as the input information. To test the system, we created 19335 questions from this information and obtained 97.22% accurate answers.
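
The abstract names TF-IDF and Cosine Similarity as the answer-selection step. The following is a minimal sketch of that general technique, not the authors' implementation: the function names, the smoothed IDF formula and the toy English tokens are all assumptions made for illustration, and real input would first pass through the lemmatization, stop-word removal and anaphora resolution described above.

```python
import math
from collections import Counter

def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector (dict) for one token list; unknown terms are dropped."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return the index of the most similar document and all similarity scores."""
    idf = build_idf(docs)
    q_vec = tfidf(query, idf)
    scores = [cosine(q_vec, tfidf(doc, idf)) for doc in docs]
    best = max(range(len(docs)), key=scores.__getitem__)
    return best, scores

# Toy, already-tokenized documents (hypothetical data, not from the paper's corpus).
docs = [
    ["dhaka", "is", "the", "capital", "of", "bangladesh"],
    ["cricket", "is", "a", "popular", "sport"],
    ["the", "padma", "is", "a", "river"],
]
query = ["capital", "of", "bangladesh"]

best, scores = rank(query, docs)
print(best, scores)
```

Because the vectors are length-normalized by the cosine, a short question can still match a long article as long as they share distinctive terms.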

Highlights

Information Retrieval (IR) refers to retrieving information from a collection of sources based on a relevant query
We introduced a Bengali Information Retrieval System (BIRS) based on Bengali Natural Language Processing (BNLP)
For the implementation of the Bengali Information Retrieval System (BIRS), we mainly described five types of corpus

Summary

INTRODUCTION

Information Retrieval (IR) refers to retrieving information from a collection of sources based on a relevant query. A huge amount of information is produced by newspapers, social networking sites and different kinds of websites. Because of these large collections of digital documents on the web or on a local machine, finding the desired information is a tedious process. Finding relevant information based on a query poses challenges such as word mismatch: a sentence can be constructed in different ways that share the same meaning but differ in structure, and a question can be formulated in different ways using synonymous words. Retrieving the desired information is therefore a challenging task. The BM25 ranking algorithm works well in different tasks [7]. More advanced methods such as Relevance-Based Language Models (Relevance Models, or RM for short) are the best-performing text retrieval ranking techniques [8]. Our main objective is to retrieve relevant information within a short time and with high accuracy.
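
For readers unfamiliar with the BM25 ranking function cited above, here is a minimal sketch of one common variant. It illustrates the cited baseline, not the method proposed in this paper; the parameter defaults `k1=1.5` and `b=0.75`, the non-negative IDF form and the toy tokens are assumptions.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every tokenized document against the query with BM25."""
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # The "+ 1.0" inside the log keeps the IDF non-negative.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates via k1 and is length-normalized via b.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["dhaka", "is", "the", "capital", "of", "bangladesh"],
    ["cricket", "is", "a", "popular", "sport"],
    ["the", "padma", "is", "a", "river"],
]
scores = bm25_scores(["capital", "bangladesh"], docs)
print(scores)
```

Unlike plain TF-IDF, the score grows sublinearly with term frequency and penalizes documents longer than the corpus average.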

RELATED WORK

PROPOSED WORK

Journal: International Journal on Natural Language Computing	Publication Date: Oct 31, 2019
Citations: 5	License type: cc-by
