A bioinformatics knowledge discovery in text application for grid computing.

Marcello Castellano, Giuseppe Mastronardi, Roberto Bellotti, Gianfranco Tarricone

doi:10.1186/1471-2105-10-s6-s23

Open Access

https://doi.org/10.1186/1471-2105-10-s6-s23

Abstract

Background: Knowledge Discovery, a fundamental activity in biomedical research, searches through large amounts of biomedical information such as documents and data. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible way to meet the intensive demand for Information and Communication Technology (ICT) resources in life science. The goal of this work was to develop a middleware solution that lets the many knowledge discovery applications run on scalable, distributed computing systems and so make intensive use of ICT resources.

Methods: The development of a grid application for Knowledge Discovery in Text (KDT) using a middleware-based methodology is presented. The system must be able to model the user application and process its jobs to create many parallel jobs that are distributed across the computational nodes. The system must also be aware of the available computational resources and their status, and must be able to monitor the execution of the parallel jobs. These operational requirements led to the design of a middleware that is specialized through user application modules. It includes a graphical user interface that gives access to a node search system, a load balancing system and a transfer optimizer that reduces communication costs.

Results: A middleware prototype is shown, together with an evaluation of its performance in terms of the speed-up factor. The prototype was written in Java on Globus Toolkit 4, which was used to build the grid infrastructure on GNU/Linux grid nodes. A test was carried out, and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed.

Conclusion: In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Database computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities.
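The job-creation step and the speed-up metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' Java middleware and does not use Globus Toolkit: the function names `split_into_jobs` and `speed_up`, the document identifiers and the choice of 8 nodes are all hypothetical, and the speed-up is assumed to follow the usual definition of serial run time divided by parallel run time.

```python
from typing import List, Sequence


def split_into_jobs(doc_ids: Sequence[str], n_nodes: int) -> List[List[str]]:
    """Partition a document collection into at most n_nodes near-equal jobs.

    Hypothetical helper: the paper does not describe its partitioning rule,
    so an even split is assumed here.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    base, extra = divmod(len(doc_ids), n_nodes)
    jobs: List[List[str]] = []
    start = 0
    for i in range(n_nodes):
        # The first `extra` jobs take one additional document each.
        size = base + (1 if i < extra else 0)
        if size:
            jobs.append(list(doc_ids[start:start + size]))
        start += size
    return jobs


def speed_up(serial_time: float, parallel_time: float) -> float:
    """Speed-up factor, assumed to be T_serial / T_parallel."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return serial_time / parallel_time


# Illustrative identifiers standing in for the 5,000 PubMed documents.
docs = [f"doc{i:04d}" for i in range(5000)]
jobs = split_into_jobs(docs, 8)  # 8 nodes is an arbitrary example value
print(len(jobs), [len(job) for job in jobs])
```

An even split is the simplest choice; a load balancing system such as the one the abstract describes would presumably weight each job by the status of its target node instead.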

Highlights

A fundamental activity in biomedical research is Knowledge Discovery, which searches through large amounts of biomedical information such as documents and data.
The test concerned the knowledge discovery of bio-entities, namely symptoms and pathologies, contained in a collection of 5,000 documents.
The software platform GATE was used for the knowledge discovery in text.

Summary

Introduction

A fundamental activity in biomedical research is Knowledge Discovery, which searches through large amounts of biomedical information such as documents and data. Bio-entity recognition aims to identify and classify technical terms corresponding to instances of concepts that are of interest to molecular biologists. Examples of such entities include the names of proteins and genes, their locations of activity (such as the names of cells or organisms), drugs, symptoms, pathologies and so on. Entity recognition is becoming increasingly important with the massive increase in reported results due to high-throughput experimental methods. It can be used in several higher-level information access tasks such as relation extraction, summarization and question answering. Given the large amount of genomic information being generated by biomedical researchers, it should not be surprising that, in the genomics era, much of the work in biomedical named-entity recognition has focused on identifying gene and protein names in free text [1,2].
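As a rough illustration of what recognizing symptom and pathology entities involves, the sketch below tags words by looking them up in a small term list. It is a toy under stated assumptions and not the GATE pipeline used in the paper: the `GAZETTEER` entries, the `find_entities` function and the example sentence are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy term list; a real system would use curated vocabularies.
GAZETTEER: Dict[str, str] = {
    "fever": "symptom",
    "headache": "symptom",
    "influenza": "pathology",
    "migraine": "pathology",
}


def find_entities(text: str) -> List[Tuple[str, str, int]]:
    """Return (term, label, character offset) for each single-word match."""
    hits: List[Tuple[str, str, int]] = []
    for match in re.finditer(r"[A-Za-z]+", text):
        # Case-insensitive lookup of each alphabetic token.
        label = GAZETTEER.get(match.group().lower())
        if label is not None:
            hits.append((match.group(), label, match.start()))
    return hits


sentence = "Influenza often presents with fever and headache."
for term, label, offset in find_entities(sentence):
    print(f"{offset:3d}  {label:9s}  {term}")
```

Single-word lookup misses multi-word terms, spelling variants and ambiguous names, which is part of why biomedical entity recognition is treated as a research problem and not a simple dictionary match.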

Journal: BMC Bioinformatics
Publication Date: Jun 1, 2009
Citations: 11
License type: CC BY 2.0
