Abstract

Background: Centralized silos of genomic data are architecturally easier to initially design, develop and deploy than distributed models. However, as interoperability pains in EHR/EMR, HIE and other collaboration-centric life sciences domains have taught us, the core challenge of networking genomics systems is not the construction of individual silos, but the interoperability of those deployments in a manner that embraces the heterogeneous needs, terms and infrastructure of collaborating parties. This article demonstrates the adaptation of BitTorrent to private collaboration networks in an authenticated, authorized and encrypted manner while retaining the characteristics of standard BitTorrent.

Results: The BitTorious portal was successfully used to manage many concurrent domestic BitTorrent clients across the United States, exchanging genomics data payloads in excess of 500 GiB using the uTorrent client software on Linux, OS X and Windows platforms. Individual nodes were sporadically interrupted to verify the resilience of the system to outages of a single client node, as well as the recovery of nodes resuming operation on intermittent Internet connections.

Conclusions: The authorization-based extension of BitTorrent and the accompanying BitTorious reference tracker and user management web portal provide a free, standards-based, general-purpose and extensible data distribution system for large ‘omics collaborations.

Highlights

  • Centralized silos of genomic data are architecturally easier to initially design, develop and deploy than distributed models

  • Taking an architectural lesson from data transfer solutions of decades past, spanning the emergence of decentralized distribution over the Internet, multicast UDP, even interplanetary networking: it may be that the essential complexity [1] of globalized "big data" fields is of sufficient magnitude that congruity may not emerge organically

  • While no approach is universally best, in the case of genomics data we argue that modifying individual purpose-specific data warehouses to expose a common, generic transfer layer based on common interaction standards is most suitable


Summary

Introduction

Centralized silos of genomic data are architecturally easier to initially design, develop and deploy than distributed models. As interoperability pains in EHR/EMR, HIE and other collaboration-centric life sciences domains have taught us, the core challenge of networking genomics systems is not the construction of individual silos, but the interoperability of those deployments in a manner that embraces the heterogeneous needs, terms and infrastructure of collaborating parties. This paper leverages the physically distributed properties of BitTorrent to create a massive logical data warehouse. Taking an architectural lesson from data transfer solutions of decades past, spanning the emergence of decentralized distribution (such as BitTorrent) over the Internet, multicast UDP, and even interplanetary networking, we present BitTorious as a free and open tool for private data syndication in translational genomics and other data-intensive fields.
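To make the idea of an authorization-based tracker concrete, here is a minimal sketch in Python of how a client announce might carry a per-user credential and how a tracker could check it before admitting the peer to a swarm. The query parameters `info_hash`, `peer_id`, `port`, `uploaded`, `downloaded` and `left` come from the standard BitTorrent tracker protocol. Everything else is an assumption made for illustration and is not taken from BitTorious itself: the `user_key` parameter, the `AUTHORIZED` table, the function names and the tracker URL are all hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical access-control table (assumption, not the BitTorious schema):
# maps a per-user key to the hex info-hashes that user may exchange.
AUTHORIZED = {
    "alice-key": {"aa" * 20},
}


def build_announce_url(tracker_url, user_key, info_hash, peer_id, port, left):
    """Build a tracker announce URL that also carries a per-user key.

    info_hash and peer_id are the usual 20-byte values; urlencode
    percent-encodes raw bytes for us.
    """
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": left,
        "user_key": user_key,  # hypothetical parameter name (assumption)
    }
    return f"{tracker_url}?{urlencode(params)}"


def is_authorized(user_key, info_hash):
    """Tracker-side check: may this user join the swarm for this torrent?"""
    return info_hash.hex() in AUTHORIZED.get(user_key, set())


if __name__ == "__main__":
    info_hash = bytes.fromhex("aa" * 20)
    url = build_announce_url(
        "https://tracker.example.org/announce",  # placeholder URL
        "alice-key",
        info_hash,
        b"-XX0001-123456789012",
        6881,
        1024,
    )
    print(url)
    print(is_authorized("alice-key", info_hash))
    print(is_authorized("bob-key", info_hash))
```

In a sketch like this, an unauthorized announce would simply receive no peer list, so the data itself still moves peer to peer while the tracker decides who may participate. Serving the announce endpoint over HTTPS is one plausible way to keep the credential confidential in transit.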

