Abstract

MotivationBiomolecular data stored in public databases is increasingly specialized to organisms, context/pathology and tissue type, potentially resulting in significant overhead for analyses. These networks are often specializations of generic interaction sets, presenting opportunities for reducing storage and computational cost. Therefore, it is desirable to develop effective compression and storage techniques, along with efficient algorithms and a flexible query interface capable of operating on compressed data structures. Current graph databases offer varying levels of support for network integration. However, these solutions do not provide efficient methods for the storage and querying of versioned networks.ResultsWe present VerTIoN, a framework consisting of novel data structures and associated query mechanisms for integrated querying of versioned context-specific biological networks. As a use case for our framework, we study network proximity queries in which the user can select and compose a combination of tissue-specific and generic networks. Using our compressed version tree data structure, in conjunction with state-of-the-art numerical techniques, we demonstrate real-time querying of large network databases.ConclusionOur results show that it is possible to support flexible queries defined on heterogeneous networks composed at query time while drastically reducing response time for multiple simultaneous queries. The flexibility offered by VerTIoN in composing integrated network versions opens significant new avenues for the utilization of ever increasing volume of context-specific network data in a broad range of biomedical applications.Availability and Implementation VerTIoN is implemented as a C++ library and is available at http://compbio.case.edu/omics/software/vertion and https://github.com/tjcowman/vertionContacttyler.cowman@case.edu

Highlights

  • Networks are commonly used abstractions in computational and systems biology [1]

  • Biological networks include: (i) experimentally identified physical and functional interactions such as protein–protein interactions (PPIs) [31, 34], protein–DNA interactions, synthetic lethality and kinasesubstrate associations; (ii) statistically and computationally inferred associations, such as expression quantitative trait loci, co-expression networks, regulatory networks and gene-disease associations [3]; (iii) evolutionary relationships based on protein co-evolution, sequence homology, protein and domain families; and (iv) ontologies and functional associations that represent curated knowledge of metabolic and signaling pathways [35], Gene Ontology and phenotype ontologies

  • To demonstrate the utility of VerTIoN, we focus on network proximity queries in tissue-specific protein interaction and expression quantitative trait loci (eQTL) networks, since the utility of these algorithms in tissue-specific networks has been well established [27]

Read more

Summary

Introduction

Due to the wide variety of available biological network data, many specialized public databases have been developed to organize and serve data to researchers. Available query interfaces range from retrieval of neighbors of a given vertex (e.g. interacting partners of a given protein) to identification of sub-networks that are enriched in a given set of molecules, or identifying the interactions that are involved in a specific pathway. These queries are often supplemented by a front-end that supports visual exploration of these networks

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call