Improved biomedical term selection in pseudo relevance feedback.

Muhammad Nabeel Asim,Waqar Mahmood,Muhammad Wasim,Muhammad Usman Ghani Khan

doi:10.1093/database/bay056

Abstract

Biomedical information retrieval systems are becoming more popular and more complex because of the massive, ever-growing volume of biomedical literature. Users are often unable to construct a precise and accurate query that clearly represents the intended information. Therefore, the query is expanded with terms or features that retrieve more relevant information. Selecting appropriate expansion terms plays a key role in improving retrieval performance. We propose document frequency chi-square, a newer version of chi-square, for term selection in pseudo relevance feedback. The effects of pre-processing on information retrieval performance, specifically in the biomedical domain, are also presented. On average, the proposed algorithm outperformed state-of-the-art term selection algorithms by 88% at pre-defined test points. Our experiments also show that stemming causes a decrease in the overall performance of a pseudo relevance feedback based information retrieval system, particularly in the biomedical domain.

Database URL: http://biodb.sdau.edu.cn/gan/

Highlights

Retrieving documents that match the user query is one of the foremost challenges in almost all information retrieval systems.
We propose a new technique, document frequency chi-square (DFC), and compare it with eight term selection algorithms, including two different versions of chi-square proposed by Carpineto [11].
We have proposed a new term selection algorithm named DFC for query expansion (QE).

Summary

Introduction

Retrieving documents that match the user query is one of the foremost challenges in almost all information retrieval systems. In local query expansion (QE), statistical information is used to find candidate expansion terms from the corpus. In this approach, documents are retrieved based on the user query, and the top k retrieved documents are considered relevant. To select candidate expansion terms from the top retrieved documents, different term selection techniques such as chi-square, information gain (IG), Kullback–Leibler divergence (KLD) and Dice are used. In global QE, candidate expansion terms extracted from dictionaries may decrease performance because of the word ambiguity problem. For example, to expand a query like ‘Which bank provides more profit?’, we would look up synonyms of the query terms in dictionaries. In this query, the word ‘bank’ can be used in two different senses. We used mean average precision (MAP) to evaluate the effectiveness of the presented algorithm on the TREC 2006 Genomics [12] dataset.
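
The local QE loop described above (retrieve, treat the top k documents as relevant, score candidate terms, expand the query) can be illustrated with a minimal Python sketch. It is not the DFC algorithm proposed in the paper, whose formula is not given here. It assumes a generic chi-square style score, (p_rel - p_col)^2 / p_col, computed from term probabilities in the pseudo-relevant set and in the whole collection. The helper names `rank_by_overlap` and `select_expansion_terms`, the overlap-based ranking and the toy documents are all hypothetical.

```python
from collections import Counter


def rank_by_overlap(query_terms, collection):
    """Toy first-pass retrieval (assumption): rank documents by how many
    distinct query terms they contain. sorted() is stable, so ties keep
    their original collection order."""
    query = set(query_terms)
    return sorted(collection, key=lambda doc: -len(query & set(doc)))


def select_expansion_terms(query_terms, collection, k=2, n_terms=3):
    """Pick expansion terms from the top-k (pseudo-relevant) documents
    using a generic chi-square style score. This is an illustrative
    stand-in, not the paper's DFC."""
    pseudo_relevant = rank_by_overlap(query_terms, collection)[:k]

    rel_counts = Counter(t for doc in pseudo_relevant for t in doc)
    col_counts = Counter(t for doc in collection for t in doc)
    rel_total = sum(rel_counts.values())
    col_total = sum(col_counts.values())

    scores = {}
    for term, count in rel_counts.items():
        if term in query_terms or col_counts[term] == 0:
            continue  # skip original query terms and unseen terms
        p_rel = count / rel_total             # probability in top-k docs
        p_col = col_counts[term] / col_total  # probability in collection
        if p_rel > p_col:                     # keep over-represented terms only
            scores[term] = (p_rel - p_col) ** 2 / p_col

    # Highest score first; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_terms]


# Hypothetical tokenized documents, for illustration only.
collection = [
    ["brca1", "gene", "mutation", "breast", "cancer"],
    ["brca1", "mutation", "breast", "cancer", "risk"],
    ["bank", "profit", "interest", "rate"],
    ["river", "bank", "water", "flow"],
]
query = ["brca1"]
expansion = select_expansion_terms(query, collection, k=2, n_terms=3)
expanded_query = query + expansion
print(expanded_query)
```

In this toy run the expansion terms come only from the two documents retrieved for the original query, which is why the quality of the first-pass ranking and the choice of k matter so much in pseudo relevance feedback.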

Related work

Methodology

Co-occurrence based query expansion

Experimental setup and results

Findings

Conclusion

Journal: Database : the journal of biological databases and curation
Publication Date: Jan 1, 2018
Citations: 7
License type: CC BY 4.0
