Abstract

The growth in available biomedical information has increased the demands placed on biomedical information retrieval systems. However, traditional information retrieval systems do not achieve the desired performance in this area. Query expansion techniques have improved the effectiveness of ranked retrieval by automatically adding terms to a query. In this work we test several automatic query expansion techniques using the Lemur Language Modelling Toolkit. The objective is to evaluate a set of query expansion techniques when they are applied to biomedical information retrieval. In the first step of information retrieval, indexing, we compare several stemming techniques and stopword lists. In the second step, matching, we compare the well-known weighting algorithms Okapi BM25 and TF-IDF. The best results are obtained with the combination of the Krovetz stemmer, the SMART stopword list and TF-IDF. Moreover, we analyze document retrieval based on the Abstract, Title and MeSH fields. We conclude that combining these fields seems more effective than looking at each of them individually. We also show that the use of feedback in document retrieval results in an improvement in retrieval. The corpus used in the experiments was extracted from the biomedical Cystic Fibrosis (CF) corpus.

Keywords: Query expansion; Biomedical information retrieval; Lemur; MEDLINE
