DVID: Distributed Versioned Image-Oriented Dataservice.

William T Katz, Stephen M Plaza

doi:10.3389/fncir.2019.00005

Abstract

Open-source software development has skyrocketed in part due to community tools like github.com, which allow publication of code as well as the ability to create branches and push accepted modifications back to the original repository. As the number and size of EM-based datasets increase, the connectomics community faces similar issues when we publish snapshot data corresponding to a publication. Ideally, there would be a mechanism where remote collaborators could modify branches of the data and then flexibly reintegrate results via moderated acceptance of changes. The DVID system provides a web-based connectomics API and the first steps toward such a distributed versioning approach to EM-based connectomics datasets. Through its use as the central data resource for Janelia's FlyEM team, we have integrated the concepts of distributed versioning into reconstruction workflows, allowing support for proofreader training and segmentation experiments through branched, versioned data. DVID also supports persistence to a variety of storage systems, from high-speed local SSDs to cloud-based object stores, which allows its deployment on laptops as well as large servers. The tailoring of the backend storage to each type of connectomics data leads to efficient storage and fast queries. DVID is freely available as open-source software with an increasing number of supported storage options.

Highlights

Generation of a connectome from high-resolution imagery is a complex process currently rate-limited by the quality of automated segmentation and time-consuming manual “proofreading,” which entails examination of labeled image volumes and correction of errors (Zhao et al., 2018).
The DVID system is a highly customizable, open-source dataservice that directly addresses the issues encountered by image-driven connectomics research.
Since a detailed exploration of each data type is beyond the scope of this paper, we provide a sampling of the Science API in Table 1 and refer readers to the embedded data type documentation in the DVID GitHub repository.

Summary

INTRODUCTION

Generation of a connectome from high-resolution imagery is a complex process currently rate-limited by the quality of automated segmentation and time-consuming manual “proofreading,” which entails examination of labeled image volumes and correction of errors (Zhao et al., 2018). The use of storage via a key-value interface allows us to exploit a spectrum of caching and storage systems including in-memory stores, embedded databases, distributed databases, and cloud data services. DVID introduces the idea of typed data instances that provide a high-level Science API, translate data requirements to key-value representations, and allow mapping types of data to different storage and caching systems. Over the course of its use, we added a number of features driven by reconstruction demands, including multi-scale segmentation, regions of interest, automatic ranking of labels by synapse count, supervoxel and label map support that provides quick merge/split operations, and a variety of neuron representations with mechanisms for updating those denormalizations when associated volumes change. This paper discusses some of the issues and interesting benefits that we discovered in using a branched versioning system for our research.
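To make the idea of branched versioning over a key-value interface more concrete, the following is a minimal sketch in Python. It is not DVID's implementation or API: the class name VersionNode, its methods, the tombstone scheme, and the example keys are all hypothetical, and an in-memory dict stands in for whatever storage backend would actually be used. It only illustrates one plausible approach, in which each version stores just the keys it changed and a read falls back through ancestor versions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional

# Marker recorded when a key is deleted in a version, so that the
# deletion hides values stored in ancestor versions (an assumption
# of this sketch, not a documented DVID mechanism).
_TOMBSTONE = object()


@dataclass
class VersionNode:
    """One version in a tree of versions (hypothetical illustration)."""

    uuid: str
    parent: Optional[VersionNode] = None
    locked: bool = False
    data: dict[str, Any] = field(default_factory=dict)

    def commit(self) -> None:
        """Freeze this version so children can safely build on it."""
        self.locked = True

    def branch(self, uuid: str) -> VersionNode:
        """Create a child version; only committed versions may be branched."""
        if not self.locked:
            raise ValueError("commit a version before branching from it")
        return VersionNode(uuid=uuid, parent=self)

    def put(self, key: str, value: Any) -> None:
        """Store a value in this version only; ancestors are untouched."""
        if self.locked:
            raise ValueError("cannot modify a committed version")
        self.data[key] = value

    def delete(self, key: str) -> None:
        """Hide a key in this version and its descendants."""
        if self.locked:
            raise ValueError("cannot modify a committed version")
        self.data[key] = _TOMBSTONE

    def get(self, key: str) -> Any:
        """Return the nearest value for key, walking up the ancestor chain."""
        node: Optional[VersionNode] = self
        while node is not None:
            if key in node.data:
                value = node.data[key]
                return None if value is _TOMBSTONE else value
            node = node.parent
        return None


# Usage: two independent branches off one committed snapshot.
root = VersionNode("root")
root.put("body:42", "neuron A")
root.commit()

training = root.branch("training")
experiment = root.branch("experiment")
training.put("body:42", "neuron A (proofread)")
experiment.delete("body:42")
```

In this sketch the committed snapshot stays readable and unchanged while the two branches diverge, which mirrors the proofreader-training and segmentation-experiment use described above. Merging branches back together, which the abstract describes as moderated reintegration, is deliberately left out.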

SYSTEM DESIGN

Example Usage

Versioned Data

Branched Versioning of Key-Value Data

Data Types

Versioning 3D Label Data

Storage Backends

Availability

RELATED WORK

FUTURE WORK

CONCLUSIONS


Journal: Frontiers in Neural Circuits
Publication Date: Feb 5, 2019
Citations: 31
License type: CC BY 4.0
