Abstract
Integrating cloud resources with federated data retrieval has the potential to improve the maintenance, accessibility and performance of specialized databases in the biomedical field. However, such an integrative approach requires technical expertise in cloud computing, the use of a data retrieval engine and the development of a unified data model that can encapsulate the heterogeneity of biological data. Here, we propose a framework for developing cloud-based specialized biological databases. It is powered by a distributed biodata retrieval system that can interface with different data formats and provides an integrated way to explore data. The framework was implemented in Java, with MongoDB as the database manager. Syntactic analysis was based on the BSON, jsoup, Apache Commons and w3c.dom open libraries. The framework is available at http://nbel-lab.com and is distributed under a Creative Commons license.
Highlights
The growing rate of biological data generation has produced unprecedented data streams, which regularly reshape our understanding of systems biology [1] and alter healthcare practice [2]
Integrating cloud resources with a federated data retrieval engine when developing specialized databases has the potential to support the continual development of databases in the biomedical field
While similar frameworks integrate some aspects of cloud resources with distributed search, they focus primarily on one specific domain (EnGene, for example, focuses on genomic data)
Summary
The growing rate of biological data generation has produced unprecedented data streams, which regularly reshape our understanding of systems biology [1] and alter healthcare practice [2]. A distributed search engine is a decentralized service that allocates data mining and query generation among numerous edges and integrates the retrieved results in a unified framework, thereby constituting a federated database. We propose a cloud-based framework for a distributed search engine for biological data.
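To make the idea of a distributed search engine concrete, the following is a minimal Java sketch of the fan-out and merge pattern described above. It is not the framework's actual code: the class `FederatedSearchSketch`, the `UnifiedRecord` type and its fields, and the in-memory "edges" are all hypothetical stand-ins. A real deployment would replace them with remote sources and a database layer such as MongoDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class FederatedSearchSketch {

    /** Hypothetical unified data model: every source is mapped onto the same fields. */
    public static final class UnifiedRecord {
        public final String source;
        public final String id;
        public final String description;

        public UnifiedRecord(String source, String id, String description) {
            this.source = source;
            this.id = id;
            this.description = description;
        }
    }

    /**
     * Sends the same query term to every edge concurrently, then merges the
     * results into one list, keeping the first record seen for each source/id pair.
     * Each edge is assumed to return records already converted to the unified model.
     */
    public static List<UnifiedRecord> federatedQuery(
            String term, List<Function<String, List<UnifiedRecord>>> edges) {
        // Fan out: one asynchronous task per edge.
        List<CompletableFuture<List<UnifiedRecord>>> pending = new ArrayList<>();
        for (Function<String, List<UnifiedRecord>> edge : edges) {
            pending.add(CompletableFuture.supplyAsync(() -> edge.apply(term)));
        }
        // Merge: wait for each edge in order and de-duplicate by "source:id".
        Map<String, UnifiedRecord> merged = new LinkedHashMap<>();
        for (CompletableFuture<List<UnifiedRecord>> future : pending) {
            for (UnifiedRecord record : future.join()) {
                merged.putIfAbsent(record.source + ":" + record.id, record);
            }
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        // Two in-memory edges standing in for remote biological data sources.
        Function<String, List<UnifiedRecord>> genes = term ->
                Arrays.asList(new UnifiedRecord("genes", "G1", term + " gene entry"));
        Function<String, List<UnifiedRecord>> proteins = term ->
                Arrays.asList(
                        new UnifiedRecord("proteins", "P1", term + " protein entry"),
                        new UnifiedRecord("proteins", "P1", term + " duplicate entry"));

        List<Function<String, List<UnifiedRecord>>> edges = Arrays.asList(genes, proteins);
        for (UnifiedRecord r : federatedQuery("BRCA1", edges)) {
            System.out.println(r.source + " " + r.id + " " + r.description);
        }
    }
}
```

Because every edge returns records in the same shape, the merge step does not need to know which data format each source used. That is the role the unified data model plays in a federated database.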