Abstract

Background
The completion of the Human Genome Project has resulted in large quantities of biological data which are proving difficult to manage and integrate effectively. There is a need for a system that can automate access to remote sites and "understand" the information it manages in order to link data properly. Workflow management systems combined with Web Services are promising Information and Communication Technology (ICT) tools. Some have already been proposed and are increasingly being applied to the biomedical domain, especially as many biology-related Web Services are now becoming available. Information on biological resources and on genomic sequence mutations are two examples of very specialized datasets that are useful for specific research domains.

Results
The architecture of a system that is able to access and execute predefined workflows is presented in this paper. Web Services allowing access to the IARC TP53 Mutation Database and CABRI catalogues of biological resources have been implemented and are available on-line. Example workflows which retrieve data from these Web Services have also been created and are available on-line.

Conclusion
We present a general architecture and some building blocks for the implementation of a system that is able to remotely execute workflows of biomedical interest, and we show how this approach can effectively produce useful outputs. The further development and implementation of Web Services allowing access to an exhaustive set of biomedical databases and the creation of effective and useful workflows will improve the automation of in-silico analysis.

Highlights

  • The Human Genome Project has transformed biology

  • The further development and implementation of Web Services allowing access to an exhaustive set of biomedical databases and the creation of effective and useful workflows will improve the automation of in-silico analysis

  • We present a set of Web Services that allows the researcher to search and retrieve information from the Common Access to Biological Resources and Information (CABRI) catalogues of biological resources and from our implementation of the International Agency for Research on Cancer (IARC) TP53 Mutation Database

Summary

Introduction

The Human Genome Project has transformed biology. Since its completion, the field has expanded to the management, processing, analysis and visualization of large quantities of data from genomics, proteomics, medicinal chemistry and drug screening. This huge amount of data and the heterogeneity of software tools that are used for its distribution make the tasks of searching, retrieving and integrating the information very difficult. Data integration needs stability that in turn can be provided by standardization of information systems, well established domain knowledge, well defined data and well defined goals. Data integration in biology suffers from heterogeneous data and sys-
