User centered and ontology based information retrieval system for life sciences

Mohameth-François Sy, Vincent Ranwez, Jacky Montmain, Sylvie Ranwez, Armelle Regnault, Michel Crampes

doi:10.1186/1471-2105-13-s1-s4

Abstract

Background: Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Semantic Web technologies and applications based on domain ontologies have brought some improvements. In life sciences, for instance, the Gene Ontology is widely exploited in genomic applications, and the Medical Subject Headings are the basis of the indexing of biomedical publications and of the information retrieval process offered by PubMed. However, current search engines suffer from two main drawbacks: user interaction with the list of retrieved resources is limited, and no explanation is given of why those resources match the query. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations.

Results: This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents retrieved and that uses a graphical rendering of query results to encourage user interaction. Semantic proximities between ontology concepts and aggregation models are used to assess how well documents match a query. The selected documents are displayed in a semantic map that provides graphical indications of the extent to which they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus by facilitating the weighting of query concepts and visual explanation. We illustrate the benefit of this information retrieval system with two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway.

Conclusions: The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user-centred application in which the system highlights relevant information to support decision making.

Highlights

Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge
The ontology based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/
First, we perform experiments to determine the impact of similarity measurement using the MuchMore collection [50]; second, we use OBIRS in a use case dedicated to gene identification

Summary

Introduction

Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. As the number of electronic resources grows, it is crucial to have powerful tools to index and retrieve documents efficiently. This is particularly true in life sciences, where new technologies, such as DNA chips a decade ago and Next Generation Sequencing today, sustain the exponential growth of available resources. Though most information retrieval (IR) systems rely on ontologies, they often take one of two extreme approaches: either they use most of the semantic expressiveness of the ontology and require complex query languages that are not really appropriate for non-specialists, or they provide a very simple query language that almost reduces the ontology to a dictionary of synonyms used in Boolean retrieval models [1]. Another drawback of most IR systems is the lack of expressiveness of their results. In the absence of any justification for the results, users may be confused and may not know how to modify their query satisfactorily in an iterative search process.

Methods
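
The abstract describes the approach only in outline: semantic proximities between ontology concepts are aggregated to assess how well each document matches a weighted query. The following Python sketch illustrates that general idea and is not the OBIRS implementation. The toy ontology, the Wu-Palmer-style similarity, the weighted-mean aggregation, and all function names are assumptions chosen for illustration; this summary does not state which measures or aggregation models OBIRS uses.

```python
from typing import Dict, List, Optional

# Hypothetical toy ontology (child -> parent); None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "process": "entity",
    "transcription": "process",
    "hemopoiesis": "process",
    "molecule": "entity",
    "transcription_factor": "molecule",
}


def ancestors(concept: str) -> List[str]:
    """Path from a concept up to the root, starting with the concept itself."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(concept: str) -> int:
    """Depth in the hierarchy, counting the root as depth 1."""
    return len(ancestors(concept))


def similarity(a: str, b: str) -> float:
    """Wu-Palmer-style proximity (an assumed measure):
    2 * depth(lowest common ancestor) / (depth(a) + depth(b))."""
    ancestors_of_b = set(ancestors(b))
    lca = next(c for c in ancestors(a) if c in ancestors_of_b)
    return 2 * depth(lca) / (depth(a) + depth(b))


def explain(query: Dict[str, float], doc_concepts: List[str]) -> Dict[str, float]:
    """For each query concept, the best proximity to any concept indexing the document."""
    return {q: max(similarity(q, d) for d in doc_concepts) for q in query}


def score(query: Dict[str, float], doc_concepts: List[str]) -> float:
    """Weighted mean of the per-concept matches (an assumed aggregation model)."""
    matches = explain(query, doc_concepts)
    return sum(query[q] * matches[q] for q in query) / sum(query.values())


# Usage: a query that weights "transcription" twice as much as "hemopoiesis".
query = {"transcription": 2.0, "hemopoiesis": 1.0}
documents = {
    "d1": ["transcription", "hemopoiesis"],
    "d2": ["transcription_factor"],
    "d3": ["hemopoiesis"],
}
ranking = sorted(documents, key=lambda d: score(query, documents[d]), reverse=True)
```

The per-concept values returned by `explain` are the kind of information a graphical interface could surface to show why a document was selected, and changing the weights in `query` changes the ranking, which mirrors the iterative query refinement described in the abstract.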

Results

Conclusion