Abstract

Background: The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately.

Results: Here we present PathSys, a graph-based system for creating a combined database of interaction networks that provides an integrated view of biological mechanisms. We used PathSys to integrate over 14 curated and publicly contributed data sources for the budding yeast (S. cerevisiae) and Gene Ontology. A number of exploratory questions were formulated as a combination of relational and graph-based queries to the integrated database. Thus, PathSys is a general-purpose, scalable graph-data warehouse of biological information, complete with a graph manipulation and query language, a storage mechanism, and a generic data-importing mechanism based on schema mapping.

Conclusion: Results from several test studies demonstrate the effectiveness of the approach in retrieving biologically interesting relations between genes and proteins and the networks connecting them, and the utility of PathSys as a scalable graph-based warehouse for interaction-network integration and as a hypothesis-generation system. The PathSys client software, named BiologicalNetworks, developed for navigation and analysis of molecular networks, is available as a Java Web Start application at .
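To make the idea of combining relational and graph-based queries over an integrated database more concrete, here is a minimal Python sketch. It is not PathSys code and does not use the PathSys query language. The record layout, the function names (`integrate`, `select`, `shortest_path`), and the gene names are all hypothetical, chosen only for illustration. The sketch merges edge lists from two toy sources, filters edges by interaction type (a relational-style selection), and then searches for a connecting path (a graph-style query).

```python
from collections import defaultdict, deque

# Hypothetical toy records; not taken from PathSys or any real database.
SOURCE_A = [
    ("geneA", "geneB", "protein-protein"),
    ("geneB", "geneC", "protein-protein"),
]
SOURCE_B = [
    ("geneB", "geneC", "protein-protein"),  # also reported by source A
    ("geneC", "geneD", "regulatory"),
    ("geneA", "geneE", "genetic"),
]


def integrate(sources):
    """Merge edge lists into one graph, recording which sources support each edge."""
    edges = defaultdict(set)
    for name, records in sources.items():
        for u, v, kind in records:
            edges[(u, v, kind)].add(name)
    return edges


def select(edges, kinds):
    """Relational-style step: keep only edges whose type is in `kinds`."""
    return {key: srcs for key, srcs in edges.items() if key[2] in kinds}


def shortest_path(edges, start, goal):
    """Graph-style step: breadth-first search, treating edges as undirected."""
    adjacency = defaultdict(set)
    for u, v, _kind in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no connecting path in the selected edge set


merged = integrate({"A": SOURCE_A, "B": SOURCE_B})
physical_or_regulatory = select(merged, {"protein-protein", "regulatory"})
path = shortest_path(physical_or_regulatory, "geneA", "geneD")
print(path)
```

In this toy case the path from `geneA` to `geneD` only exists once edges from both sources are merged, which is the kind of insight that integration is meant to provide. A real warehouse would add identifier reconciliation and persistent storage, which this sketch omits.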

Highlights

  • The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately

  • The schema mapping is stored in the Schema Map Library, and the data are ingested into the PathSys warehouse through the Data Importer, much like a bulk-loading operation in a standard DBMS

  • Validation and insights from integration: To show the impact of integrating Molecular Interaction Graphs (MIGs) on understanding biology, we present a comparison between our results and those obtained from KEGG
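The schema-mapping import mentioned in the highlights can be sketched roughly as follows. This is an assumption-laden illustration, not the actual Schema Map Library or Data Importer: the dictionary `SCHEMA_MAPS`, the function `import_rows`, the source names, the column names, and the sample rows are all hypothetical. Each source gets a map from its own column names to a common set of warehouse fields, and rows are translated and appended in one batch.

```python
# Hypothetical schema maps: source column name -> common warehouse field.
SCHEMA_MAPS = {
    "source_x": {"bait": "node_a", "prey": "node_b", "method": "evidence"},
    "source_y": {"regulator": "node_a", "target": "node_b", "assay": "evidence"},
}


def import_rows(source, rows, warehouse):
    """Translate each row via the source's schema map, then append the batch."""
    mapping = SCHEMA_MAPS[source]
    translated = []
    for row in rows:
        missing = [column for column in mapping if column not in row]
        if missing:
            raise KeyError(f"{source}: row lacks mapped columns {missing}")
        record = {target: row[column] for column, target in mapping.items()}
        record["source"] = source  # keep provenance for later queries
        translated.append(record)
    # The batch is appended only after every row has been translated,
    # so a bad row leaves the warehouse unchanged.
    warehouse.extend(translated)
    return len(translated)


warehouse = []
import_rows(
    "source_x",
    [{"bait": "geneA", "prey": "geneB", "method": "two-hybrid"}],
    warehouse,
)
import_rows(
    "source_y",
    [{"regulator": "geneB", "target": "geneC", "assay": "ChIP"}],
    warehouse,
)
```

Keeping the map as data rather than code means that, under these assumptions, supporting a new source only requires a new entry in `SCHEMA_MAPS`, which is the appeal of a generic importer.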


Summary

Introduction

The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately. Complex networks of molecular and genetic interactions are increasingly being studied for insights into biological mechanisms [1,2,3]. Such studies include deciphering genome-wide protein-protein interactions [4], large-scale analysis and prediction of gene regulatory networks [5], construction of metabolic pathways [6], and development of synthetic genetic interaction networks [7,8]. Integrated analyses across multiple databases with different functionalities are still rare, yet promising [13]. Such advances underscore the need to develop information management frameworks that adequately model graph-structured data and graph-oriented operations [14,15]. In the absence of an efficient information management system that allows biologists to query discrete and large databases simultaneously, the full potential of functional genomics resources will remain under-utilized.

